Journal of Applied Psychology 


Edited by Donald G. Paterson, University of Minnesota


Consulting Editors 


George K. Bennett, Psychological Corporation 
Walter V. Bingham, Washington, D. C. 
Harold E. Burtt, Ohio State University 
Allen L. Edwards, University of Washington 
Irving Lorge, T. C. Columbia University 


James P. Porter, Danville, Illinois 
Julian B. Rotter, Ohio State University 
Edward K. Strong, Jr., Stanford University 
Donald E. Super, T. C. Columbia University 
Morris S. Viteles, University of Pennsylvania


Quinn McNemar, Stanford University
Alfred C. Welch, Knox-Reeves, Minneapolis





Table of Contents 


Menstruation and Industrial Efficiency: I. Absenteeism and Activity Level. 
Smith 


Validity of Work Histories Obtained by Interview. E. Keating, D. G. Paterson, and C. H. Stone
Study of Executive Leadership in Business. II. Social Group Patterns. C. G. 


Readability and Interest Values in an Employee Handbook. J. N. Farr
Reliability of the Flesch Readability Formulas. P. M. Hayes, J. J. Jenkins, and B. J.


The MacQuarrie Test for Mechanical Ability. IV. Time and Motion Analysis. 


The Pre-Engineering Inventory as a Predictor of Success in Engineering Colleges. 
F. Lord, J. T. Cowles, and M. Cynamon 


The Kuder Literary Scale as Related to Achievement in College English. 4. K. 


Scores on the Strong Vocational Interest Blank and the Kuder Preference Record in 
Relation to Self Rating. R. F. Berdie 


Visual Differentiation of Moving Objects. N. C. Kephart and G. G. Besnard
An Analysis of Visual Requirements in Industry. E. J. McCormick


The Effect of Ordinal Position upon Responses to Items in a Check List. D. T. Campbell
and P. J. Mohr


Identification of Cola Beverages. IV. Postscript. N. H. Pronko and D. T. Herman
Book Reviews 
New Books, Monographs, and Pamphlets 





American Psychological Association 


Vol. 34, No. 1 February, 1950 





Journal of Applied Psychology 


Published Bi-monthly by the American Psychological Association, Inc. 
Prince and Lemon Sts., Lancaster, Pa. 


Annual subscription, $6.00; single copies, $1.25 


Subscriptions and business communications should be sent to 
American Psychological Association 
1515 Massachusetts Avenue N.W. 
Washington 5, D. C. 


Articles for publication and books for review should be sent to the Editor


Professor Donald G. Paterson, Department of Psychology 
University of Minnesota, Minneapolis 14, Minnesota 





This Journal gives prompt consideration to 
manuscripts reporting original investigations in 
any field of applied psychology except clinical 
and consulting psychology. A descriptive or 
theoretical article is occasionally accepted if it
deals in a distinctive manner with a problem of 
applied psychology. The policy is, however, to
favor papers dealing with quantitative investi- 
gations of direct value to psychologists working 
in the following fields: Vocational diagnosis and 
occupational guidance; educational diagnosis, 
prediction and guidance at the secondary school 
level and higher; personnel selection, training, 
placement, transfer and promotion in business, 
industry and government service including the 
armed forces; supervisory training in business, 
industry and government; bio-mechanics or de- 
sign of machines to fit the human operator; il- 
lumination, ventilation and fatigue in industry; 
job analysis, description, classification and eval- 
uation; measurement of morale of executives, 
supervisors, or employees; surveys of opinion on 
social or political issues, such as those conducted 
by The Psychological Corporation; psychological
problems in market research and in advertising. 


Articles may be under 500 words. The maximum
is 12,000 words, the average in the
neighborhood of 4,000 words. To reduce lag of
publication, adherence to the rule of “brevity 
consistent with clarity” is encouraged. 


A lapse of six to twelve months occurs between 
acceptance of an article and its publication, the 
lag varying with the rate at which manuscripts 
are submitted. If, however, an author is pre- 
pared to defray the costs of printing the neces- 
sary extra pages, he may arrange for earlier 
publication without thereby postponing the ap- 
pearance of manuscripts by other contributors. 
This enables the management to provide space in 
addition to the scheduled 64 pages per issue. 
“Early publication” is thus a direct contribution 
to the subscribers. By cutting down lag in pub- 
lication, it also benefits those authors whose 
articles are published in regular turn. 


Tables, footnotes and references as well as 
text of manuscripts should be typed double-spaced 
throughout. Authors should adhere to the con- 
ventions described by J. E. Anderson and W. 
L. Valentine in “The preparation of articles for 
publication in the journals of the American 
Psychological Association,” Psychol. Bull., 1944, 
41, 345-376. A reprint of this article will be 
loaned to any prospective contributor who does 
not find it in his library. 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879


Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, 
P. L. & R. of 1948, authorized October 10, 1947 


Copyright, 1950, by The American Psychological Association, Inc. 













Menstruation and Industrial Efficiency.
I. Absenteeism and Activity Level *
Anthony J. Smith 


University of Kansas 


A survey of the experimental work directly 
related to the influence of menstruation upon 
industrial efficiency reveals that investigators 
have been primarily concerned with absentee- 
ism, with much less emphasis placed upon pro- 
duction. Studies indirectly related to the prob- 
lem have dealt with such things as subjective 
changes and performances of various types, 
such as learning, steadiness, satiation, etc.¹

Many of these reports are to be criticized on 
several grounds. Some of the studies yielded 
generalizations based on pitifully small num- 
bers of individuals. Social status and age 
were often disregarded. The suitability of the
subjects, with reference to physical condition,
was often given less consideration than the
availability of the subjects. Analyses often
contrasted menstrual with nonmenstrual data,
rather than comparing the finer components of
the menstrual cycle. Fluctuations occurring
during menstruation were sometimes overemphasized,
whereas other equally great fluctuations
were ignored. The factor of suggestion
was occasionally uncontrolled. Preconceived 
personal opinions at times seem to have been 
more powerful in determining conclusions than 
the actual data obtained. Statistical treat- 
ment of the data was inadequate in some 
studies, almost completely missing in others, 
with the result that the basis for inference was 
absent or uncomfortably shaky. 

* This material is derived from the author's dissertation
submitted in April, 1945 in partial satisfaction of
the requirements for the Ph.D. degree at the University
of California, Los Angeles. The author is deeply
indebted to Dr. Roy M. Dorcus for his guidance and
criticism.

¹ For an excellent review of this material see the
article by Georgene H. Seward (6).

In spite of these criticisms, some facts seem
to stand out. Menstrual absences, despite
occasional conflicting reports, would appear to 
be, in general, rather low and of much less im- 
portance to an employer than many other 
factors contributing to absenteeism. The 
numerous subjective changes, however (de- 
pression, increased emotionality, anxiety, fa- 
tigue, irritability, cramps, headache, etc.), 
sometimes classified as “pre-menstrual tension,”
are reported by subjects with great frequency.
It would not be illogical to hypothesize
that these changes would be reflected in
the efficiency of a woman worker. 


Procedure and Analytic Techniques 


In the present investigation, workers in the 
electrical department of an aircraft factory and
in two separate garment companies were
studied. The criteria of industrial efficiency 
were absence rate, activity level, quality of 
production and quantity of production. The 
first two will be discussed in this paper. 


Aircraft Factory. Those women who served as subjects
in the aircraft factory were working at various
tasks such as assembling electrical equipment, soldering,
and constructing wiring circuits. They were paid a 
straight wage with provisions for higher overtime rates. 
The full-time secretary of the group kept accurate 
records of all leaves of absence, transfers, absences and 
tardinesses, and these records were made available for 
analysis. In addition to the use of absence rate as a 
criterion of efficiency, it was decided to examine a 
characteristic of behavior often commented on but little 
studied, i.e., activity level. It has been widely assumed 
that a woman in her menstrual period exhibits a lower- 
ing in general activity, an idea probably stemming from 
an unsubstantiated belief that a considerable number of 
women are forced to retire from general activity during 
this “sick” period. The lowering in activity level is 
presumably of such dimensions as to be readily observable.
If this were true, such a change should be apparent
to someone who has been specifically instructed to
look for it. It was evident that the leadmen in the 
group were in most intimate contact with the women 
and would be best qualified to make judgments con- 
cerning their activity levels. Consequently, they were 
required to turn in daily ratings on a form requesting 
the checking at the end of the day of one of the follow- 
ing: (a) Very energetic and industrious; (b) Fairly 
industrious, quite active; (c) Works in a slow, leisurely 
manner; and (d) Relatively inactive, works very slowly. 
The leadmen, of course, were not acquainted with the 
purpose of the study. 

Menstrual data were collected through the cooperation
of the personnel and physical education departments.
It was absolutely necessary that none of the
women participating recognize the true nature of the 
experiment for the results might then more closely 
reflect the anticipated effects of menstruation than its 
actual effects. To accomplish this, the experiment was 
described as being concerned only with the duration of 
the menstrual cycle and of the menstrual phase of the 
cycle.

A group of thirty-eight women between the ages of 
17 and 44 had been selected on the basis of age (an 
arbitrary limit of 45 being set) and the availability of 
daily inspection records. These women were then 
assembled on company time and the prepared state- 
ment of the “purpose” was read by a member of the 
personnel staff. It was emphasized that participation 
would be deeply appreciated but that no one should fail 
to assert herself should she feel disinclined to supply the 
information. All offered to cooperate. They were 
then informed that a daily contact would be made by 
a member of the physical education department to 
ascertain the phase of the menstrual cycle. The women 
were only required to inform the person whether or not 
they were menstruating. 

Information on absences, ratings of activity level, and 
menstruation was gathered for a period of forty-one 
days, during August and September, 1943. By the end 
of this period the number of women about whom suffi- 
cient information had been gathered to permit an 
analysis had been reduced to twenty-nine. 

For purposes of analysis, the menstrual cycle was 
broken down in two ways. First, a simple menstrual 
(period of flow)-nonmenstrual division was made. 
Second, the cycle was broken down into a five day 
premenstrual period, a menstrual (bleeding) period, a 
seven day postmenstrual period, and an intermenstrual 
period. 

The effect of the menstrual cycle upon absence rate 
was examined by determining the number of absences 
and number of days in attendance for all women for 
each of the four phases of the cycle. These data were 
then recorded in a two-by-four table and the signifi- 
cance of the variations was tested by means of the chi- 
square test. The nonmenstrual days were then com- 
bined in one group and the menstrual-nonmenstrual 
analysis was performed. 
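
The chi-square procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the author's original computation: the counts are the parachute-factory rows of Table 2, the column order (premenstrual, menstrual, postmenstrual, intermenstrual) is assumed, and the function names `chi_square` and `collapse` are hypothetical. No continuity correction is applied, and the critical values are the standard tabled 5 percent points.

```python
# Chi-square test of independence for an absence-by-phase table.
# Counts: parachute-factory rows of Table 2. The column order
# (pre-, menstrual, post-, intermenstrual) is an assumption.

def chi_square(table):
    """Return (statistic, degrees of freedom, expected frequencies)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected frequency = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


def collapse(row, menstrual_index=1):
    """Combine the three nonmenstrual phases into a single column."""
    menstrual = row[menstrual_index]
    return [menstrual, sum(row) - menstrual]


absent = [21, 24, 40, 66]
working = [328, 387, 467, 1039]

# Two-by-four analysis: absent/working by the four phases.
stat4, df4, expected4 = chi_square([absent, working])

# Two-by-two analysis: menstrual versus all nonmenstrual days.
stat2, df2, _ = chi_square([collapse(absent), collapse(working)])

# Standard tabled 5 percent critical values of chi-square.
CRITICAL_05 = {1: 3.841, 3: 7.815}

for label, stat, df in (("four phases", stat4, df4),
                        ("menstrual vs. nonmenstrual", stat2, df2)):
    significant = stat > CRITICAL_05[df]
    print(f"{label}: chi-square = {stat:.2f}, df = {df}, "
          f"significant at .05: {significant}")
```

Under these assumptions the expected frequencies agree closely with the parenthesized values printed in Table 2, and neither statistic reaches its 5 percent critical value.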

A group analysis of activity level was made by 
analyzing the ratings of the leadmen, again using a 
chi-square technique. Because of the low frequency
with which rank four was assigned, it was necessary to
combine ranks three and four. 

Parachute Factory. The second study was carried
out in a local garment factory in which parachutes were 
being made. A survey of the plant revealed that it 
provided an ideal set-up for such an investigation as 
this. An overwhelming majority of the employees 
were women. There was a great diversity in the types 
of work performed in the plant, permitting a wide 
sampling of tasks, and yet, except on rare occasions, the 
women continued to work at the same jobs. Moreover, 
three shifts were available for study. Finally, a piece 
rate system of payment was in use, with the result that 
excellent records were kept by the management. 

Forty-six women were selected for investigation so 
that there would be the most equitable distribution of 
women and jobs with reference to shift, age of em- 
ployee, mental difficulty of work, physical difficulty of 
work, and the necessity for standing or sitting continu- 
ously on the job. These women were contacted and 
agreed to cooperate. 

In addition to the analysis of the entire group, sep- 
arate analyses were also made for each of the following 
thirteen groupings: three shifts, three age groups, three 
levels of mental difficulty, two levels of physical diffi- 
culty, and standing vs. sitting jobs. 

Garment Factory. The last part of this investigation 
was carried out in a garment factory and was made 
possible through the cooperation of the personnel de- 
partment and the local labor union. 

The union gathered data on daily earnings, pre- 
sumably to obtain information about salaries that 
would permit of a more enlightened discussion at union- 
management arbitrations. These data were recorded 
on mimeographed forms covering weekly periods, from 
which daily attendance records could then be derived. 

The workers were contacted by the author’s wife and 
information concerning menstrual cycles was requested 
with the same apparent “purpose” being quoted. At
no time was any connection made between menstrual 
records and earning records. 

Absences were then analyzed in the manner pre- 
viously described. 


Results 


Activity Level. Results from the investiga- 
tion of activity level are presented in Table 1. 
The analyses of these data indicate that lead- 
men cannot detect any reliable change in rate 
of working during the menstrual cycle. When 
the four phases of the menstrual cycle are ex- 
amined, it is found that somewhat lower ratings 
were given during the menstrual and premen- 
strual phases. However, the probability value 
corresponding to the obtained chi-square was 
.10. When the menstrual and nonmenstrual
components were contrasted, the probability 
value was .40. 

It would seem justifiable to conclude that
if real differences in activity level among the
various phases of the menstrual cycle actually
exist, they would appear to be of such small
magnitude as to warrant little or no consideration
in this industrial situation.

Table 1
Activity Level: Frequency of Assignment of Various
Ratings During the Phases of the
Menstrual Cycle

                Pre-         Men-        Post-       Inter-
Rating        menstrual     strual     menstrual   menstrual
1                50           56          73          130
              (56.92)*     (61.56)     (74.34)     (116.16)
2                66           70          95          121
              (64.83)      (70.12)     (84.67)     (132.30)
3 and 4          31           33          24           49
              (25.24)      (27.30)     (32.97)      (51.51)

* Values in parentheses refer to expected frequencies.
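
The probability value for the four-phase analysis of activity level can be checked from the observed frequencies in Table 1. The following Python sketch is illustrative only: it assumes that the rows are rating 1, rating 2, and ratings 3 and 4 combined, that the columns run premenstrual, menstrual, postmenstrual, intermenstrual, and it uses hypothetical helper names. The tail probability uses the closed-form expression that holds only for an even number of degrees of freedom.

```python
import math

# Observed rating frequencies from Table 1.
# Assumed rows: rating 1, rating 2, ratings 3 and 4 combined.
# Assumed columns: pre-, menstrual, post-, intermenstrual.
ratings = [
    [50, 56, 73, 130],
    [66, 70, 95, 121],
    [31, 33, 24, 49],
]


def chi_square(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for obs_row, r in zip(table, row_totals):
        for obs, c in zip(obs_row, col_totals):
            exp = r * c / grand_total
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


def upper_tail_even_df(x, df):
    """P(chi-square > x) for even df: exp(-x/2) * sum of (x/2)^i / i!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


stat, df = chi_square(ratings)
p_value = upper_tail_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p_value:.2f}")
```

With these assumptions the statistic is a little over 10 on 6 degrees of freedom, with a probability near .11, in line with the reported value of about .10.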

Absences. The criterion of efficiency most 
adequately studied during this investigation 
was the absence rate of the worker. A total of 
91 women acted as subjects and absence records 
for approximately 3800 potential working days 
were examined (man-days). The data for the 
workers in each of the three factories are pre- 
sented in Table 2. It should be emphasized 
that menstrual absences are merely those that 
have occurred during the menstrual period and 
could conceivably be attributed to dysmen- 
orrhea, colds, injury, personal business, indif- 
ference, etc. If it had been possible to secure 
adequate data on causes of absences, it seems 
likely that the results would have been in close 
agreement with Gafafer’s work, in which it was 
discovered that absences attributable to dys- 
menorrhea were responsible for an average 
loss of only .29 days per person per year (2). 

Information gathered at the aircraft factory 
revealed that menstrual absences occurred less 
frequently than might have been expected on 
the basis of chance, while premenstrual ab- 
sences were more frequent. However, this 
trend was not significant (P > .30).

The more exhaustive study of the total 
group at the parachute factory yielded low 
menstrual absence rates and high postmen- 
strual absence rates. In the sub-groups no 
significant differences were found in the menstrual-nonmenstrual
analyses. However, in
the examination of the four component phases
of the menstrual cycle, significance was en- 
countered in three instances. The women who 
worked on the second shift showed relatively 
high postmenstrual and low premenstrual ab- 
sence rates (P=.05). Women in the age range 
29-38 also manifested high postmenstrual 
and low menstrual absence rates (P < .01).
Finally, the women in the age range 39-50 had 
high premenstrual and low menstrual absence 
rates (P=.02). 

In the first two instances (second shift, ages 
29-38) excessive absences do not occur during 
those parts of the menstrual cycle that have 
been presumed to be most often characterized 
by the presence of unpleasant symptoms or 
disabling effects. The reasons for the appear- 
ance of high postmenstrual absence rate in 
these two cases are unknown but it is possible 
to speculate that it might be a function of the 
following factors: a tendency toward decreased 
social activity during the menstrual period, a 
decrease in the frequency of sexual relations 
during the period of flow (4), an increase in the 
frequency of intercourse during the postmenstrual
period (4), an increase in the strength
of the sexual drive during the postmenstrual
period (1, 3), and a greater proportion of men
working on the first and third shifts combined
than on the second shift.

Table 2
Attendance Records for Total Groups

                      Pre-         Men-        Post-       Inter-
                    menstrual     strual     menstrual   menstrual
Aircraft Factory
  Absent               11            5          12           14
                     (7.89)*                               (15.83)
  Working             150                                   309
                    (153.11)                              (307.17)
Parachute Factory
  Absent               21           24          40           66
                    (22.23)      (26.18)     (32.30)      (70.39)
  Working             328          387         467         1039
                   (326.77)     (384.82)    (474.70)    (1034.61)
Garment Factory
  Absent                1           11           8           11
                     (5.29)       (6.43)      (5.80)      (13.55)
  Working              83           91          84          204
                    (78.71)      (95.57)     (86.20)     (201.45)

* Values in parentheses refer to expected frequencies.

For the final sub-group yielding significance 
(ages 39-50) it might seem at first glance that 
earlier conceptions are substantiated. How- 
ever, if the intermenstruum is accepted as a 
“normal” basis for comparison, this is not the 
case. The low menstrual absence rate offsets 
the high premenstrual rate. While the earlier 
conception is contraindicated, it would seem to 
be advisable to examine the high premenstrual 
rate for this older group in greater detail. 

At the garment factory results were en- 
countered that were significant at the 5 percent 
level. Here menstrual absences were high, 
premenstrual absences low. 

There are several differences that exist be- 
tween the garment factory and the parachute 
and aircraft factories that might be responsible 
for the obtained differences in absence rates. 

In the war industries there was the possi- 
bility that patriotic motives might play a part 
in determining both group attitudes toward
absenteeism and the likelihood that the indi- 
vidual would consider that slight discomforts 
would justify staying home. 

The fact that the loss of a day would elimi- 
nate overtime payments for that week might 
cause many ailments to be overlooked, particu- 
larly those known to be nonprogressive and of 
relatively short duration. In the garment 
factory a straight piece rate system was em- 
ployed, so absence caused no disproportionate 
loss. 

Wages in general were appreciably higher 
in the war plants so that many of the better 
workers had left the garment factory, and those 
that remained might have been less conscien- 
tious concerning attendance, particularly dur- 
ing a period of labor shortage. 

Differences may have existed in the cultural 
backgrounds of the subjects. The author does 
not have sufficient information available to 
argue that the garment workers come from a 
lower socio-economic level or that they are 
more likely to believe that menstruation has 
disabling effects. But in the light of the work 
of Sowton and Myers (7) in which it was shown 
that the detrimental effects of menstruation
tended to be greater among subjects from the
lower socio-economic group, it would seem to 
be worthwhile to carry out further, more de- 
tailed investigations. 

Finally, facilities for overcoming dysmen- 
orrhea were more adequate in the parachute 
and airplane factories. The parachute factory 
had a nurse in attendance who treated the employees
and gave suggestions for the relief of
any discomforts. Employees could come into 
the dispensary to rest and obtain help whenever 
necessary. In the aircraft factory this same 
service was available. Furthermore, groups of 
women met and were given special instructions 
in matters of hygiene and exercises that would 
be beneficial in relieving fatigue, menstrual
discomforts, etc. This process might perhaps 
work both through direct physiological changes, 
and through suggestion. 

It should also be recalled that should no 
real differences exist in absence rate, approximately
five samples out of each hundred will
yield significant differences. Whether or not
such an explanation might be defensible here, 
it is, of course, impossible to say. 
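
The point about chance significance can be made concrete with a small calculation. The Python sketch below is illustrative and rests on an assumption the study does not strictly satisfy, namely that the separate tests are independent (the subgroups here share subjects). The function names are hypothetical; the figure of 13 is the number of subgroupings analyzed at the parachute factory.

```python
# Chance of spurious "significant" results when no real difference exists.
# Assumes independent tests, which overlapping subgroups only approximate.

def chance_of_at_least_one(num_tests, alpha=0.05):
    """Probability that at least one of num_tests tests reaches level alpha."""
    return 1 - (1 - alpha) ** num_tests


def expected_false_positives(num_tests, alpha=0.05):
    """Average number of tests reaching level alpha by chance alone."""
    return num_tests * alpha


for k in (1, 13, 100):
    print(f"{k:3d} tests: expected chance results = "
          f"{expected_false_positives(k):.2f}, "
          f"P(at least one) = {chance_of_at_least_one(k):.2f}")
```

On this reckoning, thirteen independent analyses at the 5 percent level would produce at least one significant result by chance nearly half the time, which is why isolated significant subgroups call for cautious interpretation.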


Summary 


This phase of the investigation was con- 
cerned with the relationship between the com- 
ponents of the menstrual cycle and two meas- 
ures of industrial efficiency: activity level and 
absence rate. The subjects were 96 women in 
three separate factories. 

The data indicate that: 

1. There is no discernible change in activity 
level related to menstrual function. 

2. Significant differences in absence rates 
among the phases of the menstrual cycle are 
encountered in three subgroups at the para- 
chute factory and at the garment factory. High 
rates of absence are not characteristically found 
in any one phase of the cycle. 


Received April 19, 1949.


References 


1. Davis, Katherine B. Factors in the sex life of twenty-two
hundred women. New York: Harpers, 1929.

2. Gafafer, W. M. (Ed.) Manual of industrial hygiene
and medical service in war industries. Philadelphia:
W. B. Saunders Co., 1943.

3. Howell, W. H. Textbook of physiology. (14th Ed.)
Philadelphia: W. B. Saunders Co., 1941.

4. McCance, R. A., Luff, M. C., and Widdowson, E. E.
Physical and emotional periodicity in women.
J. Hyg., Camb., 1937, 37, 571-611.

5. Rider, P. R. An introduction to modern statistical
methods. New York: John Wiley, 1939.

6. Seward, Georgene H. Psychological effects of the
menstrual cycle on women workers. Psychol.
Bull., 1944, 41, 90-102.

7. Sowton, S. C. M., and Myers, C. S. (I) and Bedale,
E. M. (II). Two contributions to the experimental
study of the menstrual cycle. I. Its
influence on mental and muscular efficiency.
II. Its relation to general functional efficiency.
Rep. industr. Fatigue Res. Board, No. 45, London,
1928.











Validity of Work Histories Obtained by Interview * 
Elizabeth Keating, Donald G. Paterson, and C. Harold Stone 


Industrial Relations Center, University of Minnesota 


The work of the employment interviewer, 
the vocational counselor, and the research 
worker in the field of labor mobility is de- 
pendent in large measure upon the validity of 
work histories obtained by interview. In view 
of the urgency of the problem and in view 
of the extensive literature on the interview, it 
is surprising to discover that so little evidence 
on the problem of validity has been reported. 


Review of Literature 


Search of the literature reveals only one 
study dealing directly with the problem of 
validity of work histories obtained by interview
and in which tabular data are presented.
Creamer and Coulter (7) in reporting a survey
of unemployed mill workers in Manchester, 
New Hampshire, gave data on validity by 
comparing work history interview information 
with employer records. 

The fact that only one study of this specific 
problem giving quantitative evidence has been 
unearthed is most surprising in view of the 
voluminous literature on personnel administra- 
tion, personnel psychology, industrial relations, 
and labor market mobility. Furthermore, 
when one stops to realize the millions of situa- 
tions in which routine employment practice 
requires “recording of work history of appli- 
cants and checking with previous employers” 
it is amazing to find no reports concerning the 
validity of such work histories. Books on 
personnel management such as those by Tead 
and Metcalf (23), Shefferman (21), Scott, 
Clothier et al. (20), Baridon and Loomis (2), 
Knowles and Thompson (11), Walters (27), 
Yoder (28), Pigors and Myers (18) and Jucius
(10) have been searched in vain for cited
evidence. Books on personnel psychology
such as those by Link (13), Hollingworth (8),
Viteles (26), Laird (12), Jenkins (9), Burtt (5),
Maier (14) and Tiffin (24) are likewise unrewarding
so far as evidence on this specific problem
is concerned. Emphasis in these books is
placed on the unreliability and low validity of
the interview as a selection and placement
device. Jenkins (9) and Laird (12), however,
do mention validity of work histories obtained
by interview and in both instances the impression
is given that validity is low, but again
evidence is not cited.

* The original data were secured by the Employment
Stabilization Research Institute of the University of
Minnesota in 1940-41. Principal findings of this study
of the unemployed are reported in D. Yoder, D. G.
Paterson, H. G. Heneman, C. H. Stone, et al. Local
labor market research, Minneapolis: University of
Minnesota Press, 1948. The present report was prepared
with the assistance of Dr. Dale Yoder, Director
of Industrial Relations Center, and Dr. H. G. Heneman,
Jr., Assistant Director.

The most valuable treatment of the inter- 
view in the literature is that by Bingham and 
Moore (3). They present an excellent dis- 
cussion of the problem of validity but the 
specific evidence cited concerns only their own 
study showing a high degree of validity of 
verifiable aspects of interviews with employees 
concerning employment guarantee plans and 
their attitudes and opinions toward such plans. 

The literature on labor market mobility 
and studies of the unemployed in the labor 
market is likewise singularly free from detailed 
evidence concerning validity of work histories. 
Bowley and Burnett-Hurst (4) recognized the 
importance of the problem, collected pertinent 
data but failed to present the details. They 
merely stated: “Wage statements were checked 
in a considerable number of cases by definite 
facts from the employers, and the results show 
that there was no evident bias in the direction 
of overstatement or understatement, though 
there were mistakes” (4, p. 174). Bancroft (1) 
skirts the fringes of the problem by reporting 
a check on the accuracy with which unem- 
ployed workers on relief provide interviewers 
with statements concerning relief grants and 
date of latest registration at public employment 
offices. Palmer (17) is concerned with the 
reliability and consistency of responses given 
in labor market inquiries but cites no evidence 
on validity determined by checking with records
of previous employers. Even the excellent
studies by the Yale researchers as
reported by Reynolds and Shister (19) and by 
Noland and Bakke (16) fail to cite any evidence 
in regard to the validity of the work histories 
obtained in their extensive series of field 
interviews. 

Clague, Couper and Bakke (6) tested the 
accuracy of workers’ statements about prior 
jobs against employer records but failed to 
present quantitative evidence. Instead, they 
state: “As a result of these tests we have come
to the following conclusions. A vast majority 
of the workers answer as accurately as they are 
able to do so. Only a very small fraction of 
the schedules gave evidence of attempts to 
mislead the interviewers . . . The replies with 
respect to wage rates were on the whole 
excellent . . . On time intervals, however, he 
is much less sure of himself . . . On the whole 
the data as presented in this study are accurate, 
and if the necessary corrections could be made 
the results would not be materially changed” 
(6, p. 129).

Myers and Maclaurin (15) report that, in 
addition to information secured directly from 
company records, they interviewed a sample 
of 233 workers and checked their answers 
against company records. Again, detailed 
tabular evidence is not given but statements 
are made which seem to imply low validity of 
work history reports. For example, “More 
than a third did not mention jobs which they 
were known to have had . . . It was seldom 
that a job held as long as six months was not 
mentioned by the worker. For this reason, 
we are inclined to attribute these omissions to 
‘poor memory.’ . . . There were errors in esti- 
mating the length of particular jobs, and 
workers tended to overestimate rather than 
underestimate the lengths of the jobs they had 
held in the preceding three years. . . . . A fre- 
quent discrepancy was found between the size 
of the weekly paycheck which the worker 
claimed to have earned and the amount which 
his employment records showed he had actually 
earned. In those cases where a comparison 
was possible, more than five times as many 
workers overestimated their earnings as under- 
estimated them. Usually the difference was 
small, but the tendency was nonetheless
important. . . . One conclusion that stands out
from this analysis of discrepancies is that 
workers’ memories or statements cannot be 
relied upon for detailed factual information 
concerning their work experience. They are 
frequently unable to remember all the jobs 
they have had, the order in which they had 
them, the dates of employment, the length of 
their jobs, and their earnings.” 

Creamer and Coulter’s report (7) was a 
WPA study of the accuracy of work histories 
given by unemployed textile mill workers, 
90 per cent of whom were foreign born. Records
of 227 persons interviewed in their homes
were compared with employers' records for the
period 1930-1934. Results were reported by 
four tenure groups: 


1. Those reporting one job with continuous 
employment during the 5 year period studied. 

2. Those reporting one job but with one or 
two periods of no employment. 

3. Those reporting more than one job with 
some periods of no employment. 

4. Those reporting no employment during 
the five year period. 


Results showed that only 14.1 per cent of 
the 227 cases reported their duration of employ- 
ment with “complete accuracy.” More than 
two-thirds of the cases involved over-statement 
of the length of employment. The conclusion 
is drawn that “this evidence suggests that in 
any industry characterized by intermittent em- 
ployment, a work history of the fairly recent 
past based on the memory of the worker will 
tend to minimize the degree of intermittency 
by understating the number of jobs and by 
overstating the total duration of employment 
by appreciable amounts” (7, p. 342).

It is difficult to evaluate the Creamer and 
Coulter study in a straightforward manner. It 
is based upon home interviews of 227 former 
employees of a textile mill. The interviews 
apparently took place in November 1936 and 
covered prior employment with the Amoskeag 
Textile Mill extending backward in time from 
December 1934. The qualifications and train- 
ing of the interviewers are not stated. Fur- 
thermore, employment in this particular com- 
pany was characterized by “intermittency” so 
that it is little wonder that former employees
who were largely semi-skilled and foreign born
were hazy and inaccurate in recalling the 
number of jobs held in the company and the 
duration of unemployment during each of the 
five years preceding December 1934. A fur- 
ther difficulty is that the results are presented 
in statistical tables which provide only totals 
and averages. These are difficult if not im- 
possible to understand, let alone interpret. 
While the general impression of the low validity 
of work history data as given is probably cor- 
rect, one should keep in mind that no precise 
index of validity is available and that the data 
themselves are provided by interviewers and 
interviewees under conditions that would be 
expected to promote a maximum of memory 
error. 

In summary, the reports by Bowley and 
Burnett-Hurst and by Clague, Couper and 
Bakke give an impression of high validity 
whereas the reports by Myers and Maclaurin 
and by Creamer and Coulter give an impression 
of low validity. It is obvious that the im- 
portance of the problem in social science re- 
search and in personnel practice is such as to 
warrant citation of detailed evidence in such a 
manner that no one is forced to rely on authors’
conclusions but can interpret the evidence 
directly. There is urgent need for this kind of 
evidence. It is the aim of the present report 
not only to give this kind of evidence but also 
to give it in a form that is clear and un- 
ambiguous. 


Procedure in the Present Study 


The research reported in this paper is 
based on data gathered in studies of the oc- 
cupational competence of unemployed persons 
in St. Paul conducted by the Employment 
Stabilization Research Institute in 1940-1942 
(29). A random sample of unemployed per- 
sons who registered for employment at the St. 
Paul office of the USES during the period 
September, 1941, through February, 1942, was 
studied.¹

¹ For further detailed discussion of the sample, see
29, p. 129 ff.

Extensive personal and occupational his-
tories were secured from the registrants by
interviewers on the staff of the Research
Institute.² Specific information was obtained con-
cerning the nature of job duties, dates of em-
ployment and separation, wage rates, reasons 
for separation, names of previous employers, 
etc., for every job held by each registrant since 
his entry into the labor force. Interviews 
were conducted in a clinical situation as part 
of a comprehensive program of individual case 
analysis. Those interviewed were unaware 
that their statements were to be subjected to 
independent verification. However, each per- 
son’s full cooperation was obtained by inform- 
ing him that the results of the interviewing 
and testing would be used not only for research 
but more importantly for counseling to enable 
the person to obtain the type of job and/or 
training for which he appeared to be fitted. 

Verification and amplification of inter-
viewees’ statements were obtained by personal
interviews with employers in the Twin City
area by an “employer contact man” on the
staff of the ESRI.³ Information was also ob-
tained by mail from firms outside the metro-
politan area. Using Toops’ “follow-up” tech- 
nique (25), an 80 per cent return was obtained 
on mailed inquiries. 

Verification of employees’ statements 
against company records calls for recognition 
of the possibility of error in company records. 
However, employer contacts in the study were 
made with great care and wherever a possibility 
of inaccuracy was noted by the employer con- 
tact man, this fact was stated in the case 
history report. Data from employers thus 
noted as questionable were not used. In this 
study, consequently, discrepancies between the 
statements of former employees and the em- 
ployer are assumed to be the result of in- 
accurate information provided by the former 
employee. However, errors on the part of the 
interviewers in securing and recording work 
histories may have been involved. 

² These interviewers were first-year graduate stu-
dents specializing in personnel psychology who had had
one or more courses dealing at least in part with the
general principles of interviewing. Before actual inter-
viewing in the employment office began, each was given
a minimum of one day of detailed instruction and super-
vised practice in conducting the type of interviewing
required in this study. Close supervision was con-
tinued throughout the investigation.

³ Henry Morgan of the Research Staff was assigned
the responsibility of securing employer reports.

Complete case histories for 385 unemployed
were available. Case histories for 236 reg-
istrants (157 men, 79 women) contained suffi- 
cient information to permit their inclusion in 
the present study. Employer reports on 373 
jobs held provided information in one or more 
of the areas studied i.e., duration of job, weekly 
wage received, and nature of duties performed. 

Because memory distortion would be ex- 
pected to operate differentially in terms of the 
time these 373 previous jobs were held, it was 
necessary to classify each by a time interval 
category. Thus, a 0-12 months category in- 
cluded all jobs terminated within one year 
prior to the interview. The 13-24 months
category included all jobs terminated from 13
months to 2 years prior to the interview, and
so on. This breakdown was designed to reveal
differences in validity of the occupational
history between the immediate and more dis-
tant past.
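
The time-interval classification described above can be sketched in code. This is only an illustration: the function name and the treatment of fractional months (rounded up to the next whole month) are assumptions made here, not details given in the report.

```python
import math


def interval_category(months_since_termination: float) -> str:
    """Return the 12-month band in which a job ended, counted back from the interview.

    Hypothetical helper: the report names the bands (0-12 months,
    13-24 months, and so on) but does not say how part-months were treated.
    """
    if months_since_termination < 0:
        raise ValueError("a job cannot end after the interview")
    # Assumption: round part-months up, so 12.5 months falls in the 13-24 band.
    months = math.ceil(months_since_termination)
    if months <= 12:
        return "0-12 months"
    band = (months - 1) // 12      # 1 for 13-24, 2 for 25-36, ...
    low = band * 12 + 1
    high = low + 11
    return f"{low}-{high} months"
```

Under this scheme a job that ended 30 months before the interview would be placed in the 25-36 months category.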

Case histories of men and women were kept 
separate to facilitate study of sex differences 
in validity of job reports. Scatter diagrams 
for each time-interval were plotted to show 
the relation of the employer contact report to 
the interview record with respect to: 1. Weekly 
wages (male and female); and 2. Duration of 
job (male and female). 


Results for One Year Time Interval 


Wages. Table 1 shows the relationship be- 
tween reported weekly wages and verified 
weekly wages for male workers on jobs held 
within one year of time of interview. Table 
2 presents the same data for the women 
workers. Validity coefficients of +.90 and 
+.93 (based on ungrouped data) may be re- 
garded as justifying the acceptance of wage 
data reports obtained under the circumstances 
surrounding this investigation with consider- 
able faith. Furthermore, neither men nor 
women showed any consistent tendency to 
over-state or to under-state their wages. 
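
As an illustration of the two checks just described, the sketch below computes a Pearson r between reported and verified values and the mean signed difference (reported minus verified), which would be positive under consistent over-statement. The function names and the sample figures are hypothetical; they are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation for ungrouped paired observations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return sxy / sqrt(sxx * syy)


def mean_signed_difference(reported, verified):
    """Average of (reported - verified); positive values indicate over-statement."""
    if len(reported) != len(verified) or not reported:
        raise ValueError("need two equal-length, non-empty series")
    return sum(r - v for r, v in zip(reported, verified)) / len(reported)


# Hypothetical weekly wages in dollars, for illustration only.
reported_wages = [18.0, 25.0, 32.0, 40.0, 22.0, 55.0]
verified_wages = [18.0, 24.0, 33.0, 40.0, 22.0, 54.0]

validity = pearson_r(reported_wages, verified_wages)
bias = mean_signed_difference(reported_wages, verified_wages)
```

A high coefficient together with a mean signed difference near zero corresponds to the pattern reported here: close agreement with no consistent direction of error.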

Length (Duration) of Job. Distribution of
data for jobs on which information was
available on job duration is given in Tables 3
and 4. Again, the agreement between worker
reports and employer records is close as re-
vealed by validity coefficients of +.98.

⁴ These cases range from “professional” to “unskilled.”
See Local labor market research (29, 140-141)
for distribution according to primary job classifications.


Table 1

Relationship between Reported Weekly Wages and
Verified Weekly Wages for 116 Male Workers
on Jobs Held 0-12 Months Prior to
Interview (1940-1941)

Note: Pearson r for ungrouped data = +.90

[Scatter table: reported weekly wages (rows, 60-69 down to 0-9) by verified weekly wages (columns, 0-9 to 60-69), in dollars; the cell frequencies are not legible in the source.]

* One case not shown: reported weekly wage, $105;
verified weekly wage, $100.


Table 2
Relationship between Reported Weekly Wages and
Verified Weekly Wages for 61 Female Workers
on Jobs Held 0-12 Months Prior to
Interview (1940-1941)
Note: Pearson r for ungrouped data = +.93.

[Scatter table: reported weekly wages (rows, 20-24 down to 0-4) by verified weekly wages (columns; 10-14, 15-19, and 20-24 survive in the heading), in dollars; the cell frequencies are not legible in the source.]









































Duties of Job. In the 0-12 months time 
interval category, there were 144 male cases 
which contained data sufficient for study. 
Agreement between the interviewee’s state- 
ment and the employer report of job duties 
was found in 94 per cent of the cases. There 
was a 96 per cent agreement for the 68 women. 
The nature of the data are such as to preclude 
the presentation of “scatter tables” or the 
computation of correlation coefficients. 

In the small percentage of cases where the
duties of the job as stated by the interviewee
did not agree with the employer contact re-
ports, there was a tendency to inflate the level
of skill and responsibility in their jobs.


Table 3
Relationship between Reported and Verified Duration
of Employment in Months for 127 Male
Workers on Jobs Held 0-12 Months
Prior to Interview
Note: Pearson r for ungrouped data = +.98

[Scatter table: reported duration (rows) by verified duration (columns), in months, in 20-month class intervals (20-39, 40-59, 60-79, and 80-99 survive in the column heading); the cell frequencies are not legible in the source.]

* One case not shown: reported duration, 192 months;
verified duration, 192 months.


Table 4
Relationship between Reported and Verified Duration
of Employment in Months for 64 Female
Workers on Jobs Held 0-12 Months
Prior to Interview
Note: Pearson r for ungrouped data = +.98.

[Scatter table: reported duration (rows) by verified duration (columns), in months, in class intervals 0-19, 20-39, 40-59, 60-79, 80-99, and 100-119; total N = 64; the cell frequencies are not legible in the source.]


Results for the Longer Time Intervals 


Because the number of cases involved in 
reports for time intervals greater than one year 
is small the results are presented in terms of 
correlation coefficients only. 

Wages. Table 5 indicates that the correla- 
tions are positive and high for both men and 
women workers regardless of time interval. In 
the case of the men, the time interval extends
to six years. Of course, the number of cases
is so small that the correlation coefficients
must be accepted only with great reservations.
The most that can be said is that the evidence 
does not indicate any definite drop in validity 
with the passage of time. 

Duration of Employment. Table 6 also 
indicates that duration of employment is re- 
ported with a surprisingly high degree of 
validity even though time intervals from one 
to six years are involved. Of course, the 
numbers of cases involved are small and the 
correlations are not to be accepted at face 
value. Nevertheless, there is no evidence of 
decreasing validity with lapse of time. 

Duties of Job. Data on reports of duties
of the job, held at time intervals prior to inter-
view greater than one year, were found to be
surprisingly accurate. It is only the occasional
applicant for employment who distorts his
previous work history under the conditions
surrounding the present investigation.


Table 5

Correlations between Reported Weekly Wages and
Verified Weekly Wages for Male and Female
Workers on Jobs Held at Time Inter-
vals Greater Than One Year

Time Interval
(Job Termination Date
Prior to Interview)        Number      r

13-24 Months                 24       .95
25-36 Months                 10       .94
37-48 Months                 19       .62
49-60 Months                  5       [?]
61-72 Months                  8       .76

[The original table has separate column groups headed Men and Women. Only one set of Number and r values survives in the source, and the group to which it belongs is not recoverable. The r for 49-60 months appears only as the digit 9.]


Table 6

Correlations between Reported and Verified Duration of
Employment for Male and Female Workers
on Jobs Held at Time Intervals Greater
than One Year

[Rows: time interval (job termination date prior to interview) of 13-24, 25-36, 37-48, 49-60, and 61-72 months; the numbers of cases and the correlation coefficients are not legible in the source.]


Summary 


In the extensive literature on the interview 
surprisingly little evidence exists with respect 
to the validity of work histories obtained by 
interview. Only one study was found bearing 
directly on the problem and this study was 
seriously defective. 

The present study of applicants for employ- 
ment in the St. Paul Office of USES was con- 
ducted on a research basis in a setting in which 
occupational guidance was stressed. Pre- 
sumably little incentive existed for these appli- 
cants to distort their work histories. The 
validity of the work histories when checked by 
employer reports was found to be surprisingly 
high with respect to weekly wages, duration of 
employment and job duties. Validity re- 
mained high for histories secured for jobs held 
up to six years prior to the interview as well as 
for jobs held just prior to the interview. In 
terms of correlation coefficients, the validities
may be generalized as being from +.90 to +.98.


Received October 17, 1949. 
Early publication. 


References 


1. Bancroft, G. Consistency of information from
records and interviews. J. Amer. statist. Ass.,
1940, 35, 377-381.

2. Baridon, Felix E., and Loomis, E. H. Personnel
problems. New York: McGraw-Hill Book Co.,
1931. Pp. 25-29.

3. Bingham, W. V., and Moore, B. V. How to inter-
view. New York: Harper and Brothers, 1934
(rev. ed.). Ch. 12, pp. 190-216. (The 1942
revision omits the report of their validity study.)

4. Bowley, A. L., and Burnett-Hurst, A. R. Liveli-
hood and poverty. London, England: G. Bell and
Sons, Ltd., 1915. Pp. 222.

5. Burtt, H. E. Principles of employment psychology.
New York: Harper and Brothers, Rev. Ed.,
1942. Pp. 405-475.

6. Clague, E., Couper, W. J., and Bakke, E. W. After
the shutdown. New Haven: Institute of Human
Relations, Yale University, 1934.

7. Creamer, D., and Coulter, C. W. Labor and the
shut-down of the Amoskeag Textile Mills. Phila-
delphia: Work Projects Administration, National
Research Project Report No. L-5, November,
1939. Pp. 342.

8. Hollingworth, H. L. Judging human character.
New York: D. Appleton and Co., 1923. Pp.
263.

9. Jenkins, J. G. Psychology in business and industry.
New York: John Wiley and Sons, Inc., 1945.
Pp. 72-78.

10. Jucius, Michael J. Personnel management. Chi-
cago: Richard D. Irwin, Inc., 1947. Pp. 175-
200.

11. Knowles, A. S., and Thompson, R. D. Industrial
management. New York: The Macmillan Co.,
1944. Pp. 334-336.

12. Laird, D. A. The psychology of selecting employees.
New York: McGraw-Hill Book Co., Inc., 1937
(3rd ed.). Pp. 99-119.

13. Link, H. C. Employment psychology. New York:
The Macmillan Co., 1921. Pp. 435.

14. Maier, N. R. F. Psychology in industry. Cam-
bridge: The Riverside Press, 1946. Pp. 426.

15. Myers, C. A., and Maclaurin, W. R. The move-
ment of factory workers. New York: John Wiley
and Sons, Inc., 1943. Pp. 111.

16. Noland, E. W., and Bakke, E. W. Workers wanted.
New York: Harper and Brothers, 1949. Pp.
224.

17. Palmer, G. L. The reliability of response in labor
market inquiries. Washington, D. C.: Technical
Paper No. 22, Executive Office of the President,
Bureau of the Budget, Division of Statistical
Standards, July, 1942.

18. Pigors, P., and Myers, C. A. Personnel adminis-
tration. New York: McGraw-Hill Book Co.,
Inc., 1947. Pp. 497.

19. Reynolds, L. G., and Shister, J. Job horizons.
New York: Harper and Brothers, 1949. Pp.
102.

20. Scott, W. D., Clothier, R. C., et al. Personnel
management. New York: McGraw-Hill Book
Co., Inc., 1941.

21. Shefferman, N. W. Employment methods. New
York: The Ronald Press Co., 1920.

22. Stead, W. H., and Shartle, C. L. Occupational
counseling techniques. New York: American
Book Co., 1940. Pp. 212.

23. Tead, O., and Metcalf, H. C. Personnel adminis-
tration. New York: McGraw-Hill Book Co.,
Inc., 1933 (3rd ed.).

24. Tiffin, J. Industrial psychology. New York: Pren-
tice-Hall, Inc., 1947 (2nd ed.).

25. Toops, H. A. The returns from follow-up letters
to questionnaires. J. appl. Psychol., 1926, 10,
92-101.

26. Viteles, M. S. Industrial psychology. New York:
W. W. Norton and Co., Inc., 1932.

27. Walters, J. E. Personnel relations. New York:
The Ronald Press Co., 1945.

28. Yoder, D. Personnel management and industrial
relations. New York: Prentice-Hall, Inc., 1948
(3rd ed.). Pp. 894.

29. Yoder, D., and Paterson, D. G., et al. Local labor
market research. Minneapolis: University of
Minnesota Press, 1948.














Study of Executive Leadership in Business. 
II. Social Group Patterns 


C. G. Browne 


Wayne University


This is the second in a series of articles 
presenting the following methods for the study 
of executive relationships and leadership in 
business: RAD index,¹ social group patterns
and organizational contacts, sociometric pat- 
tern, and Goal and Achievement index. 

The total study proceeded on the following 
hypotheses: (1) leadership is a process based 
upon the inter-relationships of individuals in a 
group that is working toward a goal which has 
been accepted as desirable; (2) executive func- 
tion and leadership in business are processes of 
the interaction of social and working relation- 
ships within and outside of the executive 
groups; (3) executive and leader relationships 
can be analyzed through the application of 
methods which are not designed to meas- 
ure personal executive traits as psychological 
entities. 

The subjects for the study were 24 execu- 
tives in a tire and rubber manufacturing com- 
pany, representing all but one of the executives 
on the first, second, third, and fourth echelons 
of the business. They were classified into the 
following departmental groups: general admin- 
istration, 4 cases; sales, 6; finance, 4; manu- 
facturing, 8; and personnel, 2. 


Social Group Patterns 


As part of a longer interview, each executive 
indicated his first, second, and third choices of 
the following social groups in terms of their 
importance in his social life: group 1, indi- 
viduals within the company, called the Com- 
pany group; group 2, individuals outside of 
the company but with whom there was a 
business affiliation with the executive’s own 
work and called the Outside and Business 
group; and group 3, individuals outside of the 
company with whom there was no business 
affiliation, called the Outside Only group. 
¹ Browne, C. G. Study of executive leadership in
business. I. The RAD index. J. appl. Psychol., 1949,
33, 521-526.


Each executive also indicated the amount of 
social activity which he had with each social 
group using a three-point scale, the intervals 
designated as “large amount,” “some,” and 
“no” social life. 

In calculating the score on social group 
patterns for each executive, the first social 
group choice was given six points, that is, the 
social group which was most important in the 
social life of the executive; second choice was 
given four points; and third choice, two points. 
For the amount of activity with each social 
group, “large” was assigned four points; 
“some,” two points; and “no,” zero points. 
The score for each executive was the product 
of the points allowed for the order of each 
social group choice and the amount of social 
activity with each group. 
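
The scoring rule just described can be written out as a short sketch. The point values are those stated in the text; the function and dictionary names are hypothetical and introduced only for illustration.

```python
# Points for the order in which a social group was chosen (from the text).
CHOICE_POINTS = {1: 6, 2: 4, 3: 2}

# Points for the amount of social activity with the group (from the text).
ACTIVITY_POINTS = {"large": 4, "some": 2, "no": 0}


def social_group_score(choice_rank: int, activity: str) -> int:
    """Score for one social group: choice-order points times activity points."""
    if choice_rank not in CHOICE_POINTS:
        raise ValueError("choice_rank must be 1, 2, or 3")
    if activity not in ACTIVITY_POINTS:
        raise ValueError("activity must be 'large', 'some', or 'no'")
    return CHOICE_POINTS[choice_rank] * ACTIVITY_POINTS[activity]


# Hypothetical executive: one (choice rank, activity) pair per social group.
example_choices = {
    "Company": (2, "some"),
    "Outside and Business": (3, "no"),
    "Outside Only": (1, "large"),
}
example_scores = {
    group: social_group_score(rank, activity)
    for group, (rank, activity) in example_choices.items()
}
```

Because the score is a product, a group with which the executive reports no social life scores zero whatever its rank, and the largest score any single group can receive is 6 × 4 = 24.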

In Table 1 the social group pattern scores
are tabulated by total and departmental
groupings of executives.² For the total execu-
tive group and the sales, finance, and manu-
facturing departments, the three social groups
ranked Outside Only, Company, Outside and
Business. However, the general administra- 
tion group ranked the Company social group 
first, followed in order by Outside and Business 
and Outside Only. Whereas Outside and 
Business was the second social group with the 
general administration executives, it was the 
third or least important group with all of the 
other executive groups and the total group. 
However, Outside and Business was a relatively 
more important social group with the sales 
executives than it was with finance or manu- 
facturing. The finance executives indicated no 
Outside and Business social contacts, but their 
score for Outside Only social contacts was 
higher than for any other executive group. 

In their total social activities, the general
administration executives had the highest
standing, followed in order by sales, manu-
facturing, and finance. Both general admin-
istration and sales were above the mean and the
median for the total group, while finance and
manufacturing were below.

² The personnel executives were included in the manu-
facturing group.


Table 1

Social Group Pattern Scores Tabulated by Total and
Departmental Groups of Executives

Social Group     Executive            Mean     Median    Total
                 Department     N     Score    Score     Points

Company          Total         24     11.2      8.5       268
                 GA*            4     18.0     20.0        72
                 S              6     10.0      8.0        60
                 F              4      8.0      8.0        32
                 M             10     10.4      8.6       104

Outside and      Total         24      5.2      0.9       124
Business         GA             4     11.0     10.0        44
                 S              6      9.3      4.0        56
                 F              4      0.0      0.0         0
                 M             10      1.6      0.8        16

Outside Only     Total         24     13.5     12.0       [?]
                 GA             4      8.0      8.0       [?]
                 S              6     13.3     16.0       [?]
                 F              4     15.0     12.3       [?]
                 M             10     13.6     11.7       [?]

* Code: GA = General Administration; S = Sales; F =
Finance; M = Manufacturing.

[?] marks entries that are not legible in the source.

From this analysis it appeared that a 
general social pattern developed which was 
based upon the type of work which the execu- 
tive did or the department to which he was 
assigned. This pattern centered around the 
relative importance of the company or business 
contacts in social activities, the increase in 
their relative importance being observed in 
the general administration and sales depart- 
ments. On the other hand, there was a com- 
plete absence of social business contacts in the 
finance department. A consideration of the 
individual executive’s duties and work per- 
formed revealed that the social group choices 
followed the lines of necessity to a large degree 
in that certain types of social activity became 
necessary and essential to the performance of 
the individual as well as the departmental 
functions. 

In Table 2, the social group pattern scores 
are tabulated by echelon level in the business.
The president and general manager, who con-
stituted echelon one, had the highest possible 
score for each of the three social groups, indi- 
cating a large amount of social life with each 
social group. For him, the groups ranked as 
follows: Outside and Business, Company, Out- 
side Only. 

Echelon two was second in the total amount 
of social activity score, with the social groups 
ranked Company, Outside Only, Outside and 
Business. Echelon four preceded three in the 
total amount of social activity, this being 
strongly influenced by the scores of four sales 
executives who were in the fourth echelon. 
It was observed in the previous section that 
sales executives tended to have more social life. 
Both echelons three and four ranked the social 
groups Outside Only, Company, Outside and 
Business. 

Thus, the executives at higher echelons ex- 
hibited a closer identification with the company 
through their social activities, the Company 
and Outside and Business social groups being 
more important than the Outside Only group. 
When this is combined with the departmental 
pattern of social choices, it appears that execu- 
tives in general administration and sales and 
those in the higher echelons throughout the 
business had more social activity, particularly 
with those social groups which had a relation- 
ship with company operations. 


Table 2

Social Group Pattern Scores Tabulated by
Executive Echelon Level

Social Group     Echelon     Mean     Median    Total
                 Level       Score    Score     Points

Company          One         16.0     16.0        16
                 Two         14.3      8.7       100
                 Three        9.2      8.3        92
                 Four        10.0      8.3        60

Outside and      One         24.0     24.0        24
Business         Two          7.0      3.5        48
                 Three        2.0      0.6       [?]
                 Four         3.3      3.7       [?]

Outside Only     One          8.0     [?]        [?]
                 Two         10.3     [?]        [?]
                 Three       15.2     [?]        [?]
                 Four        15.3     [?]        [?]

[?] marks entries that are not legible in the source.



















Organizational Contacts 


To study organizational contacts, each 
executive named all of the professional and 
social organizations to which he belonged, and 
indicated which memberships, if any, were 
paid by the company. Table 3 presents these 
results by executive department and echelon. 
The general administration group indicated 
the greatest number of social organization 
memberships per executive, followed by per- 
sonnel, sales, finance, and manufacturing. It 
should be remembered, however, that the per- 
sonnel group consisted of only two cases. 
However, the company was paying for social 
memberships for certain executives in the 
general administration and sales areas only. 
A study of the duties of these executives re- 
vealed that their activities with social organiza- 
tions constituted an important aid to their 
company functioning. Obviously, then, the 
company was willing to carry the expense of at 
least some of these organizations. 

The personnel group ranked highest in the 
number of professional organizations per execu- 
tive, followed by general administration, 
finance, manufacturing, and sales. Of the six 
executives in the sales group, only the vice- 
president-sales and the sales manager belonged 
to any professional organizations. Each was 
a member of one organization, but these mem- 


berships were not paid by the company. In 
finance, both the treasurer and the comptroller 
belonged to two professional organizations, and 
in each case, both memberships were paid by 
the company. The chief cost accountant be-
longed to three professional organizations, none
of which were paid by the company. In the
manufacturing group, although 12 professional 
memberships were held, only one of them, for 
the chief chemist, was paid by the company. 

When the organizational memberships, and 
memberships paid by the company were ana- 
lyzed on the basis of executive echelon groups, 
Table 3 shows a decrease in each variable from 
echelon one through echelon four. That is, 
executives in successively lower echelons be- 
longed to fewer social organizations, fewer pro- 
fessional organizations, and fewer memberships 
were paid by the company for each type of 
membership. It might be said that as an ex- 
ecutive advanced in this company into higher 
echelons, it could have been expected that he 
would become a member of more organizations, 
and that his work would be such that the 
company would be willing to pay for more of
these memberships. On the other hand, it may 
have been that the organizational activities of 
these executives were characteristic of the in- 
dividual executive operating in his own parti- 
cular pattern and were not a particular function 
of his position in the business. 


Table 3

Membership in Social and Professional Organizations Tabulated by Executive Department and Echelon

                     Social               Professional          Total
                     Memberships          Memberships           Memberships
                     (mean)               (mean)                (mean)

                     Number   Paid by     Number   Paid by      Number   Paid by
                              Co.                  Co.                   Co.

Executive Departmental Groups

Total group           2.46      .46        1.67      .67         4.13     1.13
Genl. Adm.            4.50     1.75        2.75     1.75         6.25     3.50
Sales                 2.50      .67         .33     0.00         2.83      .67
Finance               1.75     0.00        1.75     1.00         3.50     1.00
Manufacturing         1.50     0.00        1.50      .12         3.00      .12
Personnel             3.50     0.00        4.00     2.00         [?]      2.00

Executive Echelon Groups

Echelon 1            10.00     5.00        4.00     4.00         [?]      9.00
Echelon 2             2.57      .86        1.86     1.00         [?]      1.86
Echelon 3             2.10     0.00        1.80      .40         [?]       .40
Echelon 4             1.67     0.00         .67      .17         [?]      [?]

[?] marks entries that are not legible in the source.







Conclusions 


From this study of the social group patterns 
and the organizational contacts of a population 
of 24 executives in a tire and rubber manufac- 
turing company, the following hypotheses may 
be advanced: 

1. The social group choices and the amount 
of social activity of a business executive are in 
part determined by: (1) the work which he 
does within the business; (2) the department 
in which he works; (3) the echelon level into 
which he is classified within the organization. 

2. Among the characteristics of executive 
performance and leadership in business are 
membership in social and professional organiza- 
tions and memberships paid by the company, 
the number of all of these variables increasing 
as the executive advances in echelon level. 

The question of whether the variables
studied were a function of the job or the indi-
vidual in the job can be answered for business
leadership in general only with horizontal 
studies of many executives in a wide variety 
of companies or with longitudinal studies over 
an extended period of time with specific com- 
panies or with a given group of executives. 

For the present, however, the discussion 
presented here may prove to be of value in the 
selection and training of business executives 
since it suggests that (1) working relationships 
at higher echelons and in certain departments 
may extend beyond the confines of the office 
and the factory; (2) the individual who is 
aspiring to higher echelons or to certain de- 
partments may need to accept the probability 
that his executive duties for the company will 
extend to his social life; and (3) at least part of 
an executive's social life will be planned for the 
benefit which it will yield to the company and 
his own success in it. 


Received May 17, 1949. 













Readability and Interest Values in an Employee Handbook * 


James N. Farr 
University of Minnesota 


Adequate communication between manage- 
ment and workers is essential to the success of 
modern personnel programs. The basis of 
such programs is an informed work force; a 
work force informed about company policies, 
plans, and competitive position; a work force 
with information as to the rules governing the 
organization of which they are a part, and an 
explanation of the “why” of these rules.

The means developed for this sharing of in- 
formation include employee handbooks, house 
organs, company newspapers, bulletin boards, 
etc. These media are important, and much 
time and money are spent in their production. 
There is a growing literature dealing with the 
functions of these media, and the problems 
peculiar to each. There are available detailed 
discussions of how to prepare such media for 
publication, dealing with every step from for- 
mulation of editorial policy to distribution of 
the finished product. 

Such discussions, adequate though they 
are for the problems discussed, do not go far 
enough. Distribution of the finished hand-
book or house organ is not the final step of the
process. The process is not complete until 
the employees read what has been written 
and understand what they read! 

Two aspects of the publication are im- 
portant in this final step of the process; the 
attention and interest value of the publication, 
and the readability of the material in the 
publication. The fact that employee publi- 
cations generally contain pictures and illustra- 
tions is an indication that the attention and 
interest problem has been recognized and is 
being met in part,—but only in part. It can 
never be wholly met so long as the second 
aspect—readability—is neglected, for a reader 
will not maintain interest in material that is 
difficult for him to understand. 

* For an example of an excellent employee handbook 
the reader is referred to a handbook for Northwest 
Airlines employees entitled “Hello! I’m Topper.” It 
was written for Northwest Airlines, Inc. by Mr. Wendell 


Knowles, and was the source for a number of the ideas 
used in the handbook discussed in this article.


Review of Literature 


That the problem of readability is being 
neglected is evident. It is scarcely touched 
upon in the literature dealing with employee 
publications. Heron (5) who gives us an ex- 
cellent discussion of the purposes, uses, and 
problems of communications media tells us 
only that the material must be readable and 
understandable. The National Foremen’s In-
stitute (6) in its manual on how to prepare an
employee handbook discusses in great detail 
the content and technical problems to be met 
in the publication. The problem of reada- 
bility, however, is not discussed except for a 
statement that “two-dollar” words should be 
avoided, and that the writer should strive for 
simplicity. Biklen and Breth (2) omit dis- 
cussion of the readability problem in their 
excellent book on employee publications. 
Bentley (1) in his book which deals in great 
detail with problems of editing an employee 
publication, dismisses the readability problem 
with the recommendation that newspaper 
style writing be used. This presumably 
means simplicity, familiar words, short sen- 
tences, and not too many adjectives and 
adverbs. 

In view of the fact that the achievement
of readable material in employee publications 
is essential if it is to have any value at all, the 
problem deserves much more careful considera- 
tion than this. What good is done by a 
publication which is perfect in its technical 
aspects and content, but which cannot be read 
with understanding by a considerable portion 
of the intended audience? 

Rarely is any information given as to how 
to achieve readability. Apparently it is as- 
sumed that the writer of a handbook or house 
organ need only be aware of the fact that his 
material should be readable and that he will 
then see to it that it is so. Nothing is further 
from the truth. Seeing to it that a handbook 
or house organ is readable and will be read re- 
quires just as much effort and thought as 





seeing to it that it includes the correct content 
and expresses the desired point of view. This 
fact is often overlooked, or not realized at all. 
Any personnel man who writes a handbook 
knows intuitively that he writes in a readable 
style. After all, it seems perfectly clear to 
him, and his colleagues all seem to understand 
what he has written. This is true, but the 
fact of the matter is that the handbook is not 
being written for his colleagues, but for the 
main body of the firm’s employees. These 
people are not college graduates—a good many 
of them are not even eighth grade graduates— 
and material written by a college graduate, 
and understood by his colleagues, is not so 
readily understood by these workers. In fact, 
it may seem so difficult to them that they do 
not even attempt to read it. 

This fact, that information published for 
workers must be written in language they can 
understand, if it is to hold their interest, seems 
to be self evident. Few would argue that this 
is not necessary. Most, in fact, would argue 
that they consider this factor carefully when 
publishing a handbook, for instance. This is 
no doubt true. Every effort may be made to 
make the handbook readable,—but for whom? 
The difficulty here seems to be in overlooking 
the fact that what is readable for one group of 
people is not necessarily readable for another 
group. The problem is to see that the hand- 
book is readable for the intended reader, not 
for the writer. 

This has been a difficult problem. It is 
easy to look at a piece of writing and tell 
whether it is hard to read or easy to read, in 
terms of your own scale of difficulty. It is 
much harder to judge the difficulty level in 
terms of the reading ability of someone else, 
such as a workman who has had an eighth 
grade education. Fortunately we now have 
several objective methods of estimating the 
reading difficulty level of written material. 
Application of the readability formulae devel- 
oped by Flesch (3, 4, 4a), for example, gives a 
good estimate of reading difficulty, and of the 
educational level of those who will be able to 
read the material with understanding. 
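
The interpretation of a Readability Score can be sketched as a simple lookup. The bands below follow the score ranges, style descriptions, and school grade levels tabulated in Table 1 of this article. The function name and the treatment of a score that falls exactly on a boundary (assigned here to the higher band) are assumptions made for illustration; they are not part of Flesch's published procedure.

```python
# Interpretation bands as tabulated in Table 1 of this article.
# Assumption: a score exactly on a boundary (e.g. 60) goes to the higher band.
BANDS = [
    (30, "Very difficult", "College"),
    (50, "Difficult", "H.S. or some college"),
    (60, "Fairly difficult", "Some H.S."),
    (70, "Standard", "7th or 8th grade"),
    (80, "Fairly easy", "6th grade"),
    (90, "Easy", "5th grade"),
    (100, "Very easy", "4th grade"),
]


def interpret_reading_ease(score):
    """Return (description of style, school grade level) for a score from 0 to 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    for upper, style, grade in BANDS:
        if score < upper:
            return style, grade
    # Only a score of exactly 100 reaches this point.
    return BANDS[-1][1], BANDS[-1][2]


# The two averages reported later in this article.
print(interpret_reading_ease(59.2))  # original handbook
print(interpret_reading_ease(87.7))  # revised handbook
```

The lookup only classifies a score that has already been computed; it says nothing about how the score itself is obtained.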

If the application of the Flesch formulae 
indicates that the material is written at a 
difficulty level too high for the intended audi- 
ence, the chief problem then arises: how to 




write at a level below that on which you are 
accustomed to write without seeming to “write 
down.” Flesch (3, 4a) has made this problem 
much easier to solve. By following the rules 
he has laid down it is possible with practice to 
produce material which is both readable and 
well written. 

Application of the Flesch formulae to man- 
agement communications indicates that very 
often the material is written on a difficulty level 
far beyond the comprehension of the group to 
whom it is directed. Paterson and Jenkins 
(7) in their analysis of an information sheet 
for potential applicants in one large firm found 
that it was far too difficult for that group. The 
writer found in an analysis of 25 company 
house organs, and 25 union papers, that they 
were written at a level that effectively shut
out a large portion of the intended readers. 

This is obviously a ridiculous situation. It 
is futile to expend large amounts of time, 
energy, and money to publish a handbook or 
house organ which will not, or cannot be read 
with understanding by many of those for whom 
it is prepared. 


Proposed Employee Handbook 


Recently the writer was asked to analyze 
and revise a proposed employee handbook for 
a textile firm. The results of the analysis, 
and the steps taken in the revision to ensure a 
readable handbook will be discussed in the 
following pages. 

Before beginning the revision, Flesch’s 
readability formulae were applied to samples 
drawn from the proposed handbook. This 
sampling indicated that the difficulty level of 
the writing was far too high for the group of 
employees to whom it was to be given. 

The first step, therefore, was to rewrite the 
handbook in much simpler language. This 
was done by following the rules given by Flesch 
(3, 4a). Long sentences were broken up into 
short simple sentences; words of one or two 
syllables were substituted for longer words 
wherever possible; statements were addressed 
directly to the reader wherever possible; and 
the material was personalized by the use of 
personal pronouns. 

This resulted in a handbook which was 
much easier to read, but which was far too 




long. It was now in a form which the workers 
could read but not in a form which they likely 
would read. For even though a handbook is 
written at a level the worker can understand, 
it will not hold his interest if it consists of page 
after page of words. It must be remembered 
that though the facts presented in a handbook 
are of value to a worker, they may not seem 
to him to be of immediate value. He will 
take them if they are easy to come by, but he 


will not go out of his way to sift these facts


from paragraphs of useless words. 

For this reason, a handbook should contain 
as few words as you can get by with rather 
than all that you can crowd into it. Get rid
of words wherever possible. Put in a picture
instead. Or leave white space. If nothing 
else, this will make the important facts and 
ideas stand out so that some of them get across 
even in a cursory reading. 

The second step, then, was to strip away all 
excess words which served only to hide the 
ideas that were to be presented. The result of 
this step was to reduce the number of words 
from approximately 10,400 to approximately 
2,800, and the number of pages from 39 to 28. 
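
The size of this reduction can be checked with a few lines of arithmetic. The sketch below uses only the approximate word and page counts quoted above; the variable names are illustrative.

```python
# Approximate counts quoted in the text.
original_words, revised_words = 10_400, 2_800
original_pages, revised_pages = 39, 28

# Share of the original wording that was removed.
reduction = 1 - revised_words / original_words

print(f"words removed: {reduction:.0%}")
print(f"words per page, original: {original_words / original_pages:.0f}")
print(f"words per page, revised: {revised_words / revised_pages:.0f}")
```

Roughly three words in four were cut, and the density falls from about 267 words per page to 100.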

The revision now contained 28 pages with 
an average of 100 words per page. This meant 
there was ample white space, and ample room 
for pictures or illustrations. In the third
step of the revision, illustrations were added.

The illustrations used were simple line 
figures. They serve to catch attention and to 
emphasize the rules or ideas being discussed. 
They also offer a painless method of “laying 
down the law.” Consider the example in
Figure 1. 





[Line drawing. Partly legible lettering: “HE WAS LATE SO OFTEN HE ... PASSED ON ONE DAY.”]











Fig. 1. Example of line drawing to illustrate the 
company rule regarding excessive tardiness. 




This is much less harsh than a statement to 
the effect that “if you are late too often, 
disciplinary action will be taken.” 

The revision was now complete. The 
handbook had been rewritten in simple lan- 
guage, excess words had been discarded, leaving 
the principal ideas emphasized on uncrowded 
pages, and simple illustrations had been added. 

Below is reproduced a page from the 
original handbook, followed by the page from 
the revised handbook which dealt with the 
same subject matter. 


Incentive Pay 

The Company has installed wherever possible 
an incentive or piece work plan as a means of giving 
additional pay to those producing more than the 
average for a normal day’s work. It is important 
that you, as an employee of the company, be fully 
familiar with the operation of this plan. 

A trained time study person observes the opera- 
tion being performed by an operator. The job is 
broken down into the work elements of which it is 
composed and a time study person is trained to 
recognize unusually good or below average effort 
and ability. If the time required to perform an 
operation is less than normal expectancy, the em-
ployee benefits from the extra effort. If an em-
ployee is working slower than normal, the reverse
would be true. Provision is made for normal delays
under personal and machine allowances. 

Time standards are guaranteed against change 
as long as a method of operation and conditions 
surrounding the operation remain unchanged. How- 
ever, in those instances where either the operator 
or the company feels that a rate is too low or too 
high, a re-timing may be requested and a change in 
rate negotiated. Any complaint on rates should 
be brought to the attention of your department 
supervisor. Time studies will be made in all in- 
stances where the department supervisor and the 
person in charge of time study activities feel that a 
re-timing should be made. 

The management encourages any questions or 
suggestions concerning the operation of our incen-
tive program. We are always ready to accept
constructive comments on this subject because in 
the final analysis it will be mutually advantageous. 


Compare the two sample pages shown. 
Can there be any doubt as to which can be 
best understood by the worker? Or which is 
the more likely to be read with attention? 

Consider the contents of the two pages. 
The first contains 278 words. The revised 
page contains 100 words. A glance will show 
that the difficulty level of the words used in 
the revised page is far below that of the first 
sample. It is far easier to read the revised 





page with its abundance of white space. And 
the drawing and “Reward” box, simple as 
they are, are attention catching features. 

It may be argued that the revised page has 
omitted some of the information contained in 
the original. That is true. The essential in- 
formation is there, however. It must be borne 
in mind when comparing these two pages that 
a large number of the employees who were to 
receive this handbook had only an eighth 
grade education. The writer believes that it 
is better to present the essential idea in a way 
that will get it across than to present a large 
amount of information which may not even 
be read. 

A comparative analysis of the original and 
revised handbooks was made, using Flesch’s 
new Readability Formulae (4). Thirteen 100-word
samples were drawn from the original,
and fourteen 100-word samples were drawn
from the revised handbook. The results of
the application of the formulae to these samples
are shown in Tables 1 and 2.

The average Readability Score for the
original handbook was 59.2, a difficulty level
requiring some high school for adequate under-
standing. The scores ranged from “Difficult”
to “Fairly Easy” with 8 of the 13 scores “Fairly
Difficult” or harder. This means that over
half of the samples drawn from the original
handbook required “some high school” or more
education for adequate understanding. In
view of the fact that a large number of the em-
ployees who were to receive the handbook had
eighth grade or less education, it is evident
that its effectiveness would have been limited.


Reward! For those who do better than average work

Our piece-work plan gives you more pay for “better than average” amount of work.

Here’s how it works.

A time-study man studies your job. He finds out how long the average worker takes to do it.

Then the piece-work rate is set so that if you are an average worker you earn a fair day’s pay.

If you are better than the average and take less than the average time to do the job -- you earn extra pay.

YES SIR, EXTRA EFFORT PAYS OFF IN EXTRA DOLLARS AT ........


Fig. 2. Sample page from revised handbook.










Table 1
Flesch Readability Scores and Interpretations for Samples Drawn From Original and Revised Handbooks

Readability    Description         Typical           School Grade Level
Score          of Style            Magazine          of Potential Audience

0 to 30        Very difficult      Scientific        College
30 to 50       Difficult           Academic          H.S. or some college
50 to 60       Fairly difficult    Quality           Some H.S.
60 to 70       Standard            Digests           7th or 8th grade
70 to 80       Fairly easy         Slicks-fiction    6th grade
80 to 90       Easy                Pulp-fiction      5th grade
90 to 100      Very easy           Comics            4th grade

Average: 59.2 (original), 87.7 (revised)

[The tabulation columns giving the number of original and revised samples in each band are not legible in the source; the averages are those stated in the text.]


Table 2

Flesch Human Interest Scores and Interpretations for Samples Drawn From the Original and Revised Handbook

Human Interest    Description           Typical
Score             of Style              Magazine

0 to 10           Dull                  Scientific
10 to 20          Mildly interesting    Trade
20 to 40          Interesting           Digests
40 to 60          Highly interesting    New Yorker
60 to 100         Dramatic              Fiction

[The tabulation columns for the original and revised samples and the Average row are not legible in the source.]


The average Readability Score for the re-
vised handbook was 87.7, a difficulty level
requiring only a fifth grade education for
adequate understanding. The scores for the
samples ranged from “Fairly Easy” to “Very
Easy.” Very few employees would be unable
to understand material written at this level.

The Human Interest scores were also
higher for the revised handbook, as shown in
Table 2. This means that the revised hand-
book contained more personal pronouns and
personal sentences than did the original. This,
according to Flesch, makes the material more
interesting to read.


Summary


1. Recognition of the importance of com-
munication between management and workers
has resulted in an increase in the use of
written media such as house organs, hand-
books, etc. Discussions of the content, pur-
poses, and uses of these media are available
in the literature. Discussions of the technical
problems of publication are also available.

2. The important problem of readability in 
these written communications has not been 
dealt with, however. There is little in the 
communications literature dealing with the 
need to write at the level the worker can un- 
derstand, and little dealing with the problem 
of how to write readable material. 

3. The Readability Formulae of Flesch now 
provide an objective method of evaluating the 
difficulty level of written material, and the 
rules laid down by Flesch point the way to 
writing readable copy. 

4. Application of the Readability Formulae 
to management-employee communications in- 
dicates that they are frequently written at a 







difficulty level far too high for the intended 
audience. 

5. Application of the formulae to a proposed 
employee handbook classified it as “Fairly 
Difficult,” requiring “some high school” read- 
ing ability for adequate comprehension. Many 
of the workers who were to read the handbook, 
however, had only eighth grade or less educa- 
tion. The handbook was rewritten, using 
short sentences, short common words, and 
more personal words. Excess words were dis- 
carded, leaving more white space per page, 
thus making it easier to read. Flesch analysis 
classified the material as “Easy,” requiring 
fifth grade reading ability for understanding. 


Received April 16, 1949.




References 


1. Bentley, G. How to edit an employee publication.
New York: Harper & Brothers Publishers, 1944.

2. Biklen, P. F., and Breth, D. B. Successful employee
publication. New York: McGraw-Hill Book
Company, Inc., 1945.

3. Flesch, R. The art of plain talk. New York:
Harper and Brothers, 1946.

4. Flesch, R. A new readability yardstick. J. appl.
Psychol., 1948, 32, 221-233.

4a. Flesch, R. The art of readable writing. New York:
Harper and Brothers, 1949.

5. Heron, A. R. Sharing information with employees.
Stanford: Stanford University Press, 1942.

6. American Management Association. How to prepare
and publish an employee manual. New York,
1946.

7. Paterson, D. G., and Jenkins, J. J. Communica-
tion between management and workers. J. appl.
Psychol., 1948, 32, 71-80.








Reliability of the Flesch Readability Formulas 


Patricia M. Hayes, James J. Jenkins, and Bradley J. Walker * 
Department of Psychology, University of Minnesota 


A formula developed by Flesch (3) for esti- 
mating the comprehension difficulty of written 
material has received widespread attention in 
many areas of communication. It has been 
applied in the fields of journalism (7), adver- 
tising (1), industrial communications (5), gov- 
ernment publications (8) and many others. 

In view of the wide use of his formula, Flesch
(4) published a revision in 1948 designed to in- 
crease its utility and make its application and 
interpretation easier. This revision proposes 
the use of two formulas to measure two rela- 
tively independent aspects of readability. 
The first formula involves word length (wl)
and sentence length (sl) and gives a measure
of “reading ease” (RE). The second formula,
based on personal words (pw) and personal 
sentences (ps), yields a measure of “human 
interest” (HI). 
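
As a minimal sketch, the two revised formulas can be written as two short functions. The coefficients are not given in this article; they are the ones published by Flesch in the 1948 revision cited here as (4), and the function names are illustrative. Word length (wl) is taken as syllables per 100 words, sentence length (sl) as average words per sentence, personal words (pw) as the number per 100 words, and personal sentences (ps) as the number per 100 sentences.

```python
def reading_ease(wl, sl):
    """Reading ease (RE) from Flesch's 1948 formula.

    wl: syllables per 100 words
    sl: average number of words per sentence
    """
    return 206.835 - 0.846 * wl - 1.015 * sl


def human_interest(pw, ps):
    """Human interest (HI) from Flesch's 1948 formula.

    pw: personal words per 100 words
    ps: personal sentences per 100 sentences
    """
    return 3.635 * pw + 0.314 * ps


# Illustrative inputs: the "test" means reported for the second study
# in this article (wl 155.4, sl 20.6, pw 7.3, ps 12.8).
print(round(reading_ease(155.4, 20.6), 1))
print(round(human_interest(7.3, 12.8), 1))
```

With these inputs the reading ease value comes out close to the mean of 54.5 reported for the second study. The analysts in these studies read their scores from prepared tables (2) rather than computing them directly, so small differences from tabled values are to be expected.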

As these formulas become increasingly 
popular, they must, of course, be evaluated 
critically. Like other psychological tools, they 
must be tested for validity and reliability. 
Flesch (4) reports several studies of the validity 
of the original formula which indicate that 
material rated more readable by the formula 
also proves more readable in terms of reader- 
ship surveys and opinions of judges. As yet, 
however, no studies of the reliability of the 
formulas as applied by different analysts have 
been reported. 

The following studies were designed as first 
steps in the examination of analyst-to-analyst 
reliability of the formulas to determine the 
extent to which they are effectively objective. 


First Study 


The material chosen for analysis in the first 
study consisted of the 40 prize-winning letters 
in the recent General Motors’ “Why I Like 
My Job” contest (9). These letters were 
selected because they presented a wide range 

* Hayes and Jenkins are primarily responsible for the 


first study in this paper and Walker is primarily re-
sponsible for the second.


of difficulty, style, structure and content. It 
was believed the letters would afford a maxi- 
mum number of problems in interpretation and 
would provide a rigid test of objectivity. 

Two sets of samples were drawn from the 
40 letters. Each set consisted of two 100- 
word samples from each letter. Since the 
letters ranged in length from about 350 to 3000 
words with a median length of 750 words, there 
was little overlapping between the sets of 
samples. 

Two experienced and two inexperienced 
analysts participated in the study.¹ The expe-
rienced workers analyzed both sets of samples;
the inexperienced each analyzed one set. The 
experienced analysts had worked with the 
original formula and the revised formulas for 
a year and a half. The inexperienced analysts 
had never worked with the formulas before. 
The analysts made no attempt to agree on 
interpretation of Flesch’s instructions and re- 
frained from discussing interpretation with 
anyone else. 

Reading ease and human interest scores 
were computed from tables (2) in an effort to 
minimize computational errors. Results of 
analyses made by different investigators on 
the same set of samples were compared by 
determining: the significance of differences be-
tween means of the four variables (wl, sl, pw,
ps) and the two scores (HI, RE) for the set of 
samples as a whole; the extent of correlation 
between results of analysts; the number and 
degree of differences in actual scores for each 
sample; and the number of differences in de- 
scriptive categories assigned to each sample. 


Results of First Study 


The assumption of wide variability in the 
material used was confirmed. As may be 
seen in Table 1, samples ranged from 6 to 88 
on the reading ease scale of 0 to 100 points and 


¹ The writers would like to acknowledge the assistance
of Barbara Lee and James Farr in this part of the
project.







Table 1
Means, Standard Deviations and Ranges for the Four
Variables and Two Scores of the Flesch Reada-
bility Formulas for Each Analyst

Means and standard deviations

[The column positions of the means and standard deviations cannot be recovered from the source. The surviving values, in source order, are: 145.0, 143.2, 145.0; 144.0, 143.8, 14.3; 11.9, 3.5; 11.8, 3.3; 11.7, 3.7; 26.6, 17.6, 18.4; 18.2, 16.4, 15.6; 94, 2.9; 9.0, 9.8, 2.8; 9.7, 9.9, 3.4; 21.4, 26.7, 21.1; 14.3, 13.6, 14.0.]

Ranges

Sample
Set     Analyst    wl         sl       RE       pw      ps       HI

1       A          124-174    11-72    7-85     4-20    0-100    15-100
1       B          125-174    10-72    6-84     4-20    0-83     13-97
1       C          124-173    11-72    6-82     5-18    0-91     18-94

2       B          122-164    12-80    42-81    5-15    0-100    16-79
2       C          122-162    11-74    42-88    5-17    0-100    18-80
2       D          123-164    12-74    42-86    5-15    0-100    16-80


from 13 to 100 on the human interest scale of 
0 to 100 points. 

The means of the four variables and two 
scores obtained by analysts on the same sets 
of samples were tested for significant differences 
by use of the critical ratio corrected for correla- 
tion. None of the differences between any 
analysts in either sample set proved to be 
significant at the five per cent level. 
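
The test named here can be sketched as follows. The article does not print the formula, so the version below is the textbook critical ratio for the difference between two correlated means, with the standard error of each mean taken as SD divided by the square root of N. That choice, the function name, and all input values are assumptions made for illustration; the inputs are hypothetical and are not taken from Table 1.

```python
import math


def critical_ratio_correlated(mean_a, mean_b, sd_a, sd_b, r, n):
    """Critical ratio for two means from the same n samples.

    r is the correlation between the two analysts' values. The correlation
    term reduces the standard error of the difference.
    """
    sem_a = sd_a / math.sqrt(n)  # standard error of the first mean
    sem_b = sd_b / math.sqrt(n)  # standard error of the second mean
    se_diff = math.sqrt(sem_a**2 + sem_b**2 - 2 * r * sem_a * sem_b)
    return (mean_a - mean_b) / se_diff


# Hypothetical values for two analysts scoring the same 40 samples.
cr = critical_ratio_correlated(21.4, 21.1, 14.3, 14.0, r=0.94, n=40)
print(round(cr, 2))

# A ratio below 1.96 in absolute value is not significant at the five per cent level.
print(abs(cr) < 1.96)
```

Because the same samples were scored by each analyst, ignoring the correlation term would overstate the standard error and make real differences harder to detect.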

Rank difference correlations were computed 
between each pair of analysts within each 
sample set on the rank given each letter. 
These correlation coefficients are presented in 
Table 2. All of the correlations are positive 
and significantly different from zero beyond 
the one per cent level. 
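
For readers unfamiliar with the statistic, a rank difference correlation can be sketched in a few lines. The code uses the usual formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which assumes no tied values. The function names and the scores are hypothetical and are not data from this study.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_difference_correlation(x, y):
    """Rank difference (Spearman) correlation for untied paired values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists of at least two values")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


# Hypothetical reading ease scores given to six letters by two analysts.
analyst_a = [62, 48, 75, 31, 55, 80]
analyst_b = [60, 57, 77, 30, 52, 79]
rho = rank_difference_correlation(analyst_a, analyst_b)
print(round(rho, 3))
```

Only the ordering of the letters matters here: two analysts can differ by a constant number of points on every letter and still correlate perfectly, which is why the point-by-point comparison reported later is also needed.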

An inspection of Table 2 indicates that 
analysts were in good agreement in interpreting 
the components and final score for reading 
ease. For the human interest variables, how- 
ever, there was much less agreement between 
analysts. Personal words were apparently in- 




terpreted much the same by analysts, but it is 
evident there were diverse interpretations of 
personal sentences. This acts, of course, to 
lower the correlations of the human interest 
score. 

It should be noted that correlations be- 
tween experienced analysts (B and C) are not 
appreciably different from those with inexperi- 
enced analysts (A and D). 

Since neither of the statistical methods 
presented above reveals the actual differences 
between analysts for a given sample, a third 
kind of comparison was made. Within each
sample set all analysts were compared on
reading ease and human interest scores for 
each letter. Actual point differences for each 
pair of analysts were tabulated. Results of 
the 240 comparisons are shown in Table 3. 

This table also suggests that there is greater 
agreement on reading ease (90 per cent of the 
comparisons within four points of each other) 
than on human interest (90 per cent of the 
comparisons within eight points of each other). 
On the 100-point scale designed to be used as 
an estimating device, deviations as small as 
these do not appear to be of great importance.
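
The figures quoted above can be retraced with a short sketch. It counts the comparisons (three analysts per sample set give three pairs for each of the 80 samples) and then accumulates the percentages of Table 3 to find the point difference that covers 90 per cent of the comparisons. Several per-row entries are damaged in the source table, so the lists below were filled in from the differences between successive cumulative percentages; that reconstruction, and the names used, are assumptions.

```python
from itertools import accumulate, combinations

# Three analysts scored each sample; every pair of analysts is one comparison.
pairs_per_sample = len(list(combinations("ABC", 2)))
total_comparisons = pairs_per_sample * 80

# Per cent of comparisons differing by 1, 2, ..., 10 and "11 or more" points.
re_percent = [49.2, 25.8, 9.2, 6.2, 3.8, 1.6, 1.7, 0.8, 0.9, 0.8, 0.0]
hi_percent = [33.3, 18.0, 6.6, 15.4, 4.2, 5.4, 3.4, 3.7, 2.5, 0.8, 6.7]


def points_covering(percentages, target=90.0):
    """Smallest point difference whose cumulative percentage reaches target."""
    for points, cumulative in enumerate(accumulate(percentages), start=1):
        # Round to one decimal so float noise cannot hide an exact 90.0.
        if round(cumulative, 1) >= target:
            return points
    return None


print(total_comparisons)
print(points_covering(re_percent), points_covering(hi_percent))
```

The two thresholds agree with the text: four points for reading ease and eight points for human interest.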

Again it should be mentioned that no con- 
sistent difference was found in the number or 
extent of deviations between scores of experi- 


Table 2 


Rank Order Correlation Coefficients between Pairs of 
Analysts for the Variables and Scores of the 
Flesch Readability Formulas * 


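                 Sample Set 1                     Sample Set 2
Analysts**    A and B   A and C   B and C     C and D   B and D   B and C

wl               ?         ?         ?          .99        ?         ?
sl              .94       .97       .94         .83       .87        ?
RE               ?        .99       .98         .93       .95        ?

[Entries marked ? are illegible in the source. The rows for pw, ps and HI survive only as the fragments below, whose column positions cannot be recovered.]

pw    93 94 96
ps    8 =O 74 37)
HI    8 991 .% 97 8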


* These correlations should be interpreted with cau-
tion since the data are markedly skewed in the case of
ps and HI. It should be noted, however, that the 
order of relative accuracy for the scores is the same 
whether correlations, point differences or category 
differences are considered. 

** Analysts A and D are inexperienced; B and C, 
experienced. 
















enced analysts compared to their deviations 
with scores of inexperienced analysts. 

A final comparison was made between ana- 
lysts in terms of descriptive categories in which 
results are often reported and utilized. Flesch 
(4) divides the reading ease range into seven 
levels varying from “very difficult” to “very 
easy” and the human interest range into five 
levels varying from “dull” to “dramatic.” 


Table 3


Differences in Score Points Between Analysts
on Identical Samples *


                    Reading Ease                 Human Interest

Difference      Per Cent of   Cumulative     Per Cent of   Cumulative
in Points       Comparisons   Percentage     Comparisons   Percentage

1                  49.2          49.2           33.3          33.3
2                  25.8          75.0           18.0          51.3
3                   9.2          84.2            6.6          57.9
4                   6.2          90.4           15.4          73.3
5                   3.8          94.2            4.2          77.5
6                   1.6          95.8            5.4          82.9
7                   1.7          97.5            3.4          86.3
8                    .8          98.3            3.7          90.0
9                    .9          99.2            2.5          92.5
10                   .8         100.0             .8          93.3
11 or more                                       6.7         100.0


* Based on 240 comparisons; three analyses for each
of 80 samples of 200 words.


Of the 240 comparisons, in only 14 cases 
(5.8 per cent) did the analysts differ in the 
category assigned to reading ease. In 28 cases 
(11.7 per cent) they differed in the category 
assigned to human interest. Of these cases of 
disagreement, only two (.8 per cent) were 
greater than one category for reading ease, and 
only four (1.7 per cent) were greater than one 
category for human interest.


Discussion 


While the results of this first study seem to 
constitute a limited but fairly clear answer to 
the question of reliability of the Flesch for- 
mulas, mention of the greatest sources of error 
may be of some value in interpreting the data 
and may provide a few hints to those who wish 
to use the formulas. 


The greatest discrepancies obviously appear 
in interpretation of personal sentences. A 
study of Table 2 shows that correlations for 
this variable are especially low when analyst 
“C” is involved. Samples which contributed 
most to the discrepancy between “C” and the 
other analysts were studied. Over half of the 
major differences between “C” and the others 
involved one type of personal sentence defined 
by Flesch (4) as “grammatically incomplete 
sentences whose full meaning has to be inferred 
from the context.” The examples given by 
Flesch (4) appear to be taken from conversa-
tions, and apparently the definition was re-
garded as limited to conversations by analysts
“A,” “B” and “D.” It would seem, however,
that the examples are to this extent misleading 
since conversational sentences are already 
covered by Flesch’s first definition regarding 
spoken sentences. It might be suggested that 
analysts study the definitions carefully and 
that Flesch provide more varied examples. 

A second source of disagreement involved 
rhetorical questions. Analyst “C” did not 
count these as personal sentences, but it 
appears clear from Flesch’s definition that 
these should have been considered and scored 
as the other analysts scored them. 

If these two sources of error (incomplete 
sentences and rhetorical questions) had been 
corrected, correlations for personal sentences 
would have been raised above .90, and the 
human interest scores would have appeared 
much more reliable. 

Errors in personal words were few and 
appear to be due largely to carelessness. 
While there were no consistent errors, from 
time to time one analyst or another tended to 
regard a common-gender noun like “worker” 
or “manager” as a personal word. Flesch’s 
definitions and examples are explicit on this 
point (4). 

Errors in sentence length resulted chiefly 
from disregarding directions on counting the last 
sentence when the sample ends in the middle 
of a sentence and from disagreement on break- 
ing sentences into units of thought. It may 
be noted that the lowest correlations in Table 
2 for sentence length involve analyst “D.” A 
study of the samples yielding the greatest 
discrepancies revealed that if analyst “D” had
broken sentences into units of thought in just







two instances, correlations would have been 
above .95. Here it appears that more careful 
attention to directions would have assured 
high reliability. 

Errors in word length are all very small and 
appear to reflect minor clerical errors. 


Second Study 


A second study was conducted to test our 
findings with a large number of inexperienced 
analysts. Samples of 500 words from 63 house 
organs and employee publications which were 
being examined in connection with a continuing 
study of industrial communication (6) were 
assigned for analysis to 18 members of a gradu- 
ate seminar in psychology. Each student 
analyzed seven publications which were sub- 
sequently reanalyzed by another member of 
the seminar. Assignments were anonymous 
and cooperation between students was dis- 
couraged. Only three of the students had 
appreciable experience with the formulas prior 
to the time of the study. 


Table 4


Sampling Statistics of Test and Re-Test Distributions
for the Second Study


            Means             Standard Deviation          Range

        Test    Re-Test       Test     Re-Test        Test       Re-Test

wl     155.4     154.9        7.98      7.98         138-172     140-167
sl      20.6      20.5        4.39      4.15          15-45       13-45
RE      54.5      55.2        8.40      7.71          30-73       31-69
pw       7.3       6.6        2.38      2.24           3-16        2-13
ps      12.8      13.1       11.17     11.45           0-46        0-48
HI      30.3      28.3       10.74      9.56          15-64        8-62


The results of the analyses provided pairs 
of scores for each of the publications. The 
first analyses were compared with the second 
analyses to determine the reliability of the 
application of the formulas to the same 
samples. Product moment correlations be- 
tween the “test” and “retest” analyses are as 
follows: wl, .90; sl, .92; RE, .91; pw, .78; ps, .64; 
and HI, .81. All coefficients were positive and 
significantly different from zero. 

The data for the means, standard devia- 
tions and ranges are presented in Table 4. A 
comparison of the standard deviations and the 


ranges in Table 4 with those in Table 1 reveals 
that the material used in the second study was 
appreciably more homogeneous than that used 
in the first. The correlations found in the 
second study, then, might be expected to be 
smaller than those of the first study. Ac- 
cordingly, the correlations presented immedi- 
ately above were “corrected” by estimating 
their magnitude on the basis of the more 
heterogeneous material of the first study. 
This “correction” gave the following coeffi- 
cients: wl, .95; sl, .99; RE, .98; pw, .88; ps, .85; 
and HI, .92. 
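
The article does not state which formula was used for this “correction.” One common choice for reliability coefficients assumes that the error variance is the same in both sets of material, so that only the true-score variance differs between them. The following is a minimal sketch under that assumption; the function name and the standard deviations in the example are hypothetical and are not the values of Table 1.

```python
def correct_reliability_for_range(r_restricted, sd_restricted, sd_reference):
    """Estimate the reliability expected in a more variable reference group.

    Assumption (not stated in the article): the error variance is the same
    in both groups, so 1 - r scales with the ratio of the score variances.
    """
    if sd_restricted <= 0 or sd_reference <= 0:
        raise ValueError("standard deviations must be positive")
    variance_ratio = (sd_restricted / sd_reference) ** 2
    return 1.0 - variance_ratio * (1.0 - r_restricted)


# Hypothetical illustration: an observed coefficient of .90 on material with
# a standard deviation of 8.0, re-expressed for material with a standard
# deviation of 12.0.
estimate = correct_reliability_for_range(0.90, 8.0, 12.0)
print(round(estimate, 3))
```

When the two standard deviations are equal the coefficient is returned unchanged, which is a quick check on the arithmetic.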

These correlations approximate those found 
in the first study and would lead one to the 
same conclusions. Reading ease with its com- 
ponents is analyzed quite reliably and human 
interest with its components is analyzed with 
less, though still fair, reliability. Analysis of 
personal sentences again shows the greatest 
lack of agreement between analysts. 

The difference in points between “test” and 
“retest” analyses agrees rather closely with the 
data from the first study given in Table 3. 
Ninety per cent of the paired scores for reading 
ease were within six points of each other and 
approximately ninety per cent of the paired 
scores for human interest were within eight 
points of each other. 


Summary and Conclusions 


An examination of analyst-to-analyst re- 
liability of the Flesch readability formulas was 
conducted. In the first study two sets of 
samples were drawn from reading material of 
a highly variable nature believed to involve a 
large number of problems of interpretation. 
The sets of samples were analyzed by two 
inexperienced and two experienced analysts. 
Results of analysts for each set of samples were 
compared by testing the significance of mean 
differences on the variables and the scores, 
correlating results on the variables and scores, 
tabulating deviations in terms of score points 
and tabulating disagreements in descriptive 
categories. 

In the second study, eighteen students 
analyzed samples of 500 words from 63 indus- 
trial house organs. Each sample was inde- 
pendently analyzed by two analysts. Correla- 
tions between the first and second sets of 













analyses were computed and then corrected for 
restriction of range. Deviations in terms of 
score points were computed. 

From the above data the following con- 
clusions seem justified: 

1. Analyst-to-analyst reliability on word 
length, sentence length, and reading ease is 
quite high for the kinds of material used in 
this study. 

2. Analyst-to-analyst reliability on personal 
words is fair, but on personal sentences (and as 
a result on human interest) is lower than might 
ordinarily be considered desirable. 

3. For practical purposes the Flesch for- 
mulas and the directions for their use are 
sufficiently objective to be used even by in- 
experienced analysts to obtain estimates of the 
reading ease and human interest of written 
material. 


Received June 8, 1949. 


References 


1. Alden, J. Lots of names—short sentences—simple 
words. Printer’s Ink, June 29, 1945, 21-22. 

2. Farr, James N., and Jenkins, James J. Tables for 
use with the Flesch readability formulas. J. 
appl. Psychol., 1949, 33, 275-278. 

3. Flesch, R. The art of plain talk. New York: 
Harper and Brothers, 1946. 

4. Flesch, R. A new readability yardstick. J. appl. 
Psychol., 1948, 32, 221-233. 

5. Paterson, D. G., and Jenkins, James J. Communi- 
cation between management and workers. J. 
appl. Psychol., 1948, 32, 71-80. 

6. Paterson, D. G., and Walker, B. J. Readability 
and human interest of house organs. Personnel, 
1949, 25, 438-441. 

7. Swanson, C. E. Readability and readership: a con- 
trolled experiment. Journ. Quart., 1948, 25, 
339-343. 

8. How does your writing read? U. S. Civil Service 
Commission. Washington: U. S. Government 
Printing Office, 1946. 

9. The worker speaks. General Motors, 1947. 





The MacQuarrie Test for Mechanical Ability. 
IV. Time and Motion Analysis * 


Charles H. Goodman 


Radio Corporation of America 


This is the fourth¹ and last article on the 
use of the MacQuarrie Mechanical Ability Test 
in a radio manufacturing company. This 
study presents the results of a time and motion 
analysis of the test. The purpose of this study 
was to determine whether a time and motion 
analysis could provide insight into the number 
of factors being measured by the test, and 
whether this type of analysis could assist in the 
interpretation of the factors in the test. 

Of the seven MacQuarrie sub-tests, only 
four—Tracing, Tapping, Dotting, and Copying 
—lend themselves to time and motion analysis. 
The reason is that only these four sub-tests 
involve manual movements and could there- 
fore be analyzed by this technique. The par- 
ticular method of time and motion analysis 
used in this study was that of synthesis,² which 
is defined as a method of determining the select 
time for a given motion pattern by the applica- 
tion of standard moving time values to a de- 
tailed motion analysis. 


Procedure 


Upon completion of the time and motion 
analysis, it was possible to construct Table 1 
which shows the various time and motion 
elements found to be operating in the four sub- 
tests. This analysis also provided the data 
whereby one could determine which elements 
were common among the tests and the fre- 
quency with which these elements occurred 
while carrying out the tasks. 


* The writer wishes to thank Mr. Fred Weber, Time 
Study Engineer, RCA Victor Corporation, for his 
assistance in conducting the time and motion analysis 
for this study. 

¹ Goodman, Charles H. The MacQuarrie test for 
mechanical ability. I. Selecting radio assembly oper- 
ators. J. appl. Psychol., 1946, 30, 586-595; II. Factor 
analysis. J. appl. Psychol., 1947, 2, 150-154; III. 
Follow-up study. J. appl. Psychol., 1947, 5, 502-510. 

² Synthesis manual. Radio Corporation of America, 
RCA Victor Division, Camden, N. J., 1944. 


Results 


Overlapping r’s were computed on the basis 
of the data recorded in Table 1. The formula 
used to compute the overlapping r’s was: 

$$ r = \frac{n_c}{\sqrt{(n_a + n_c)(n_b + n_c)}} $$ 

The object of computing these overlapping 
r’s was to determine whether the relationships 
among these four tests as based upon the time- 
study elements, would approximate the rela- 
tionship among the same tests as shown by the 
Pearsonian intercorrelations. Table 2 shows 
the overlapping r’s and their Pearsonian coun- 
terparts. Four of the six overlapping r’s deviate 
from their Pearsonian r’s by .00 to .09. The 
overlapping r for the Tracing and Tapping Tests 
deviates .21 from its corresponding Pearson r. 
The sixth and largest deviation of .23 involves 
the Tapping Test with Copying. The writer 
is unable on the basis of available data to ex- 
plain why these two deviations are larger than 
the other four. 
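
The overlapping r divides the number of common elements by the square root of the product of the two tests' element totals. The sketch below assumes that n_c counts the elements shared by both tests and that n_a and n_b count the elements specific to each test; the article does not define the symbols, and it does not say whether element frequencies entered the counts. The element sets shown are hypothetical and are not the entries of Table 1.

```python
import math


def overlapping_r(n_a, n_b, n_c):
    """Correlation implied by common elements: n_c / sqrt((n_a+n_c)(n_b+n_c))."""
    denominator = math.sqrt((n_a + n_c) * (n_b + n_c))
    if denominator == 0:
        raise ValueError("each test needs at least one element")
    return n_c / denominator


def overlapping_r_from_sets(elements_a, elements_b):
    """Apply the formula to two sets of element names (an assumed reading)."""
    common = elements_a & elements_b
    return overlapping_r(
        len(elements_a - common), len(elements_b - common), len(common)
    )


# Hypothetical element lists for two sub-tests.
test_x = {"transport loaded", "finger wrist movement", "visual inspection"}
test_y = {"transport loaded", "finger wrist movement", "elbow pivot", "shoulder pivot"}
print(round(overlapping_r_from_sets(test_x, test_y), 2))
```

Two tests with identical element lists give 1.0 and two tests with no shared element give 0.0, which brackets the values the formula can take.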

While it appears that there is considerable 
agreement between the Pearsonian r’s and the 
overlapping r’s, one cannot lose sight of the 
fact that in using the formula for computing 
the overlapping r’s one does not know if the 
factors operate independently and additively, 
or in a dependent and related manner. While 
this finding is of considerable interest, further 
study would certainly be needed before one 
could attach any significance to it. 

Since it was found that considerable agree- 
ment existed between the Pearsonian r’s and 
the overlapping r’s, the question was raised as 
to whether or not time and motion analysis 
would be helpful in interpreting the factors 
being measured by the MacQuarrie test. 

An earlier study³ had been made by the 


³ Goodman, C. H. Op. cit. 
















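Table 1 

Time-Study Elements in the Tasks Required by the MacQuarrie Tests 

[Title partly illegible. Column headings legible in this copy: Transport Loaded; Simo Inspection; Finger Wrist Movement; Finger Wrist Forearm; Forearm Movement; Upper Arm Movement; Elbow Pivot; Shoulder Pivot. Rows: Tracing, Tapping, Dotting, Copying. Cell entries are not legible.] 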


writer in an endeavor to identify the factors in 
the MacQuarrie test by means of factor analy- 
sis. This factorial study showed three factors 
to be operating in the MacQuarrie test. The 
test loadings of the first factor ranged from 
.369 to .000. While none of the test loadings 
for this factor were statistically significant,⁴ it 
appeared to the writer that some factor was 
operating. The sub-tests of Tracing, Dotting, 


and Pursuit had loadings of .369, .338 and .325 
respectively, while the remaining four sub- 


tests were zero or slightly above zero. While 
the writer was unable to identify the factor 
on the basis of the factor analysis data, time 
and motion analysis data clearly indicated to 
the writer that it was a “visual inspection” 
factor, since the time and motion analysis 
showed that the three largest test loadings on 
this first factor called for considerable amounts 
of visual inspection before the tasks could be 
done. On the other hand, little, if any, visual 
inspection was called for by the remaining four 
sub-tests. From a statistical point of view this 
factor was named only tentatively since none 
of the loadings met the criterion of being as 
large as .40. On the other hand, based on the 
time study analysis, the writer had no hesitancy 
in so naming the factor. It is further the 
opinion of the writer that he could not have 
readily identified this factor without the time 
study data. 

The test loadings of the second factor in the 
factorial analysis were statistically significant 

⁴ Thurstone, L. L. Primary mental abilities. Chi- 
cago: University of Chicago Press, 1938. 






and this factor was named a “spatial” factor. 
Corroborating evidence that this factor was a 
“spatial” factor comes from the work of Thur- 
stone⁵ and Harrell,⁶ in that their studies also 
showed the MacQuarrie to be loaded with a 
“spatial” factor. 

The time and motion analysis, in the opinion 
of the writer, did not in this instance provide 
any assistance in identifying this “space” 
factor. The reason for this may be that the 
space factor is psychological in content, pre- 
sumably involves mental processes and there- 
fore escapes the detection of time and motion 
studies, a technique which is only applicable to 
observable acts. 

In naming the third factor disclosed by the 
factorial analysis, the writer found the time 
and motion data of material assistance. It 
was apparent from such analysis that the tests 
carrying the highest loadings involved manual 
movement. This might readily have been con- 
cluded from mere study of the tests themselves. 
However, further study of the data produced 
by the time study analysis showed that it was 
more than mere manual movement. The time 
study data showed that in carrying out these 
tasks it was necessary to exert careful control 
in carrying out the manual movements. This 
control involved the muscles being used in 
order that motions could be stopped when 
there was need to make alignment and to 
carefully guide the pencil in prescribed areas. 

⁵ Thurstone, L. L. Op. cit. 

⁶ Harrell, W. A factor analysis of mechanical ability 
tests. Psychometrika, 1940, 5. 







Further evidence to support this view was ob- 
tained from high speed camera photographs 
which showed the motion in operation and the 
control being exercised. It was on the basis of 
this evidence that the writer named this factor 
a “controlled manual movement.” 

Harrell,⁷ in his studies with the Mac- 
Quarrie, found a factor evidently similar to 
that of the writer. Harrell named this factor 
a “manual agility” factor. On the basis of the 
foregoing evidence, the present writer cannot 
accept the concept and connotation of this 
factor as being a “manual agility” factor. 


Table 2 

Pearsonian Correlations Compared with Correlations 
Computed on the Basis of Overlapping Elements 

[Columns: Tracing, Tapping, and Dotting, each with a Pearson r and an overlapping r. Rows: Tapping, Dotting, Copying. Most entries are illegible in this copy; the legible ones read: Tapping, .27; Dotting, .58 and .41; Copying, .39 and .31.] 

As a result of this study it would appear to 
the writer that the technique of time and 
motion analysis might be of considerable assist- 
ance in helping to analyze performance tests 
for the purpose of identifying the factors in- 
volved in such tests. It would seem that the 
area of mechanical dexterity would be a fruitful 
field to explore. One should not lose sight of 
the fact that time and motion analysis will 
probably tend to show a greater number of 
elements in operation than does the statistical 
factorial method. This can be seen in Table 1 
of the present study where some eleven ele- 
ments are shown as compared with the three 
factors disclosed by factor analysis. This may 
be due primarily to the fact that the elements 
⁷ Harrell, W. Op. cit. 


of time and motion analysis are more minutely 
broken down and that there may be various 
combinations of these elements in a single 
task. However, there is no available evidence 
to prove or disprove this contention. More 
important is the fact that time study elements 
should not be considered as being equal or com- 
parable to the so-called psychological factors 
disclosed by factor analysis. 


Summary 


This study has presented the findings of a 
time and motion analysis of four of the sub- 
tests of the MacQuarrie Mechanical Ability 
Test. The purpose of the study was to de- 
termine whether time and motion analysis 
could assist in the interpretation of the factors 
being measured by the test. 

On the basis of the findings of this study 
the following conclusions might be warranted: 

1. Correlations computed on the basis of 
overlapping time and motion elements give, in 
four of the six cases, r’s closely similar to their 
Pearsonian r’s. Deviations in four cases range 
from .00 to .09. The fifth and sixth deviations 
were .21 and .23. It would be mere speculation 
on the part of the writer to explain these larger 
deviations. 

2. There is close agreement in the analysis of 
the overlapping elements as produced by the 
factor analysis and by the time and motion 
analysis. The factor analysis identified a 
space factor, and a manual movement factor. 
The time and motion analysis revealed a visual 
inspection factor and a controlled manual 
movement factor. The space factor, which 
appears to be more psychological in nature, was 
not revealed by the time and motion analysis. 

3. Time and motion analysis technique 
appears to have the possibility of being a valu- 
able adjunct for gaining analytical insight into 
psychological tasks involving manual move- 
ments. 


Received April 16, 1949. 










The Pre-Engineering Inventory as a Predictor of Success in 
Engineering Colleges * 


Frederic Lord and John T. Cowles 
Educational Testing Service 


and 


Manuel Cynamon 
Brooklyn College 


The Pre-Engineering Inventory is a battery 
of seven objective tests, designed primarily to 
assist in the selection of those students who 
will be most likely to succeed in engineering 
schools.¹ This report summarizes a group of 
related studies on the reliability of this battery 
of tests and on their predictive efficiency, as 
evidenced particularly by correlations of the 
test scores with various measures of engineering 
school achievement. 

The Pre-Engineering Inventory is intended 
for use both as an instrument for selecting 
entering students and for the guidance of 
students before or during the first two years of 
undergraduate engineering studies. It is the 
purpose of the Pre-Engineering Inventory to 
supply measures of only those aptitudes or 
background factors which lend themselves to 
reliable measurement by means of written tests. 
Admissions and guidance officers are advised 
to use Pre-Engineering Inventory test scores in 
conjunction with other information about the 
student obtained from previous academic rec- 


* The authors have been aided in the preparation of 
this article by the valuable suggestions and criticisms 
of Dr. A. Pemberton Johnson, Dr. William G. Mollen- 
kopf, Dr. William B. Schrader, and Dr. Ledyard R. 
Tucker. We especially wish to acknowledge the able 
editorial work of Mrs. Mabel K. Rugg. 

¹ The Inventory is one of several related groups of 
tests now in use or in process of development as part of 
the Measurement and Guidance Project in Engineering 
Education. This Project, initiated in 1943, is spon- 
sored by the American Society for Engineering Educa- 
tion and the Engineers’ Council for Professional Devel- 
opment. The testing and research activities of the 
Project were transferred from the Graduate Record 
Office to the Educational Testing Service in January 
1948 at the time the latter organization was established. 
An advisory council representing the two sponsoring 
societies meets periodically with representatives of the 
Educational Testing Service in order to review and 
guide the policies and progress of this Project. 


ords, recommendations by teachers, and per- 
sonal interviews. 


Description of Tests and Scores 


The present form of the Pre-Engineering 
Inventory, known as Revised Form A, was first 
made available in 1944, and has been used in 
two principal types of testing programs: a 
national program administered on fixed dates 
at widespread testing centers, and an institu- 
tional program administered on varying dates 
by cooperating institutions. The national 
program was designed for testing applicants 
for entrance to engineering schools, the institu- 
tional program for testing students already 
enrolled in those institutions. 

The difficulty level of the tests of the Pre- 
Engineering Inventory has been adjusted to the 
range of capacity of a base sample of pre-war 
freshmen in a broadly representative group of 
accredited engineering schools; the present 
form of the Inventory is not intended to be used 
in high schools or liberal arts colleges. 

The tests of the Pre-Engineering Inventory 
are contained in two booklets, with separate 
answer sheets adapted to machine or hand 
scoring. The questions are of the objective, 
multiple-choice type and include passages re- 
quiring interpretation, problems for solution, 
diagrams to comprehend, and questions on 
specific information. Emphasis is placed on 
essential aptitudes, skills, fundamental infor- 
mation, and the discovery or application of 
principles, rather than on mere factual memory 
or achievement in specific courses of instruc- 
tion. The seven tests included in the battery 




Table 1 

Reliabilities of Raw Scores on the Pre-Engineering Inventory Tests for Five Schools Participating in the 
Fall 1947 [Program] and for a Random Sample from the April 1948 National Program 

Tests 

I. General Verbal Ability (100 items) 
II. Technical Verbal Ability (90 items) 
III. Ability to Comprehend Scientific Materials (100 items) 
IV. General Mathematical Ability (90 items) 
V. Ability to Comprehend Mechanical Principles (52 items) 
VI. Spatial Visualizing Ability (56 items) 
VII. Understanding of Modern Society (70 items) 

Number of Cases 

are:² Test I, General Verbal Ability; Test II, 
Technical Verbal Ability; Test III, Ability to 
Comprehend Scientific Materials; Test IV, 
General Mathematical Ability; Test V, Ability 
to Comprehend Mechanical Principles; Test 
VI, Spatial Visualizing Ability; and Test VII, 
Understanding of Modern Society. 

In addition to the separate scores for the 
seven tests of the Inventory, an eighth score, the 
Composite Score, is obtained by adding together 
the raw scores of Tests II, III, and IV. This 
Composite Score represents a general verbal 
and quantitative aptitude score. Preliminary 
studies on experimental forms of the test indi- 
cated that, for practical purposes, this Compos- 
ite Score was the best index obtainable from 
the entire battery of the candidate’s general 
aptitude for engineering study; it was one of 
the purposes of the present study to reexamine 
this preliminary finding concerning the Compos- 
ite Score. 


² See K. W. Vaughn, The Pre-Engineering Inventory, 
J. Engng. Educ., 1944, 34, 615-625, for a description of 
the separate tests and the criteria by which they were 
chosen. 


[Table 1, National Program column; legible entries: .94, .92, .88] 


Reliability of the Tests 


Reliability coefficients are given in Table 1 
for a random sample group taken from the 
April 1948 nationwide testing of applicants to 
engineering schools, and also for recently en- 
rolled freshmen students in five schools where 
testing was carried out during the fall of 1947. 
These coefficients are based on a division of 
each test into “rational halves.” In making 
this division any group of items based on a 
single paragraph of reading matter or on a 
single chart or table was treated as an indi- 
visible unit. An attempt was made to match 
the halves, in so far as possible, with respect to 
item difficulty, subject matter, item type, and 
position in the test. 

The reliability coefficients were obtained 
from the correlation between half-test scores 
by the application of the Spearman-Brown 
Formula (modified so as to take into account 
the fact that the two “halves’”’ of the test did 
not necessarily contain equal numbers of 
items). Since the number of groups of items 
in a single test was not always large, it was 
often impossible to achieve satisfactory match- 







ing of the test items. As a result, the relia- 
bility coefficients presented probably tend to 
be underestimates of the actual test reliabili- 
ties in many cases. 
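
The step-up from a half-test correlation to a full-test reliability can be sketched as follows. The first function is the ordinary Spearman-Brown formula for equal halves. The article does not give its modification for unequal halves, so the second function is only one possible version, derived on the assumption that true-score and error variance are both proportional to the number of items in a part; the function names and the example figures are hypothetical.

```python
import math


def spearman_brown(r_halves):
    """Full-test reliability from the correlation between two equal halves."""
    return 2.0 * r_halves / (1.0 + r_halves)


def step_up_unequal_parts(r_parts, p):
    """Full-test reliability from two parts holding proportions p and 1 - p
    of the items.

    Assumption: true and error variance grow in proportion to part length.
    Reduces to the Spearman-Brown value when p = 0.5.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 <= r_parts <= 1.0:
        raise ValueError("r_parts must lie between 0 and 1")
    if r_parts == 1.0:
        return 1.0
    k = p * (1.0 - p) * (1.0 - r_parts ** 2)
    return (r_parts * math.sqrt(r_parts ** 2 + 4.0 * k) - r_parts ** 2) / (2.0 * k)


# Hypothetical illustration: a part correlation of .80 with a 50/50 split
# and with a 40/60 split of the items.
print(round(spearman_brown(0.80), 3))
print(round(step_up_unequal_parts(0.80, 0.40), 3))
```

Under these assumptions an uneven split yields a slightly higher estimate than the equal-halves formula for the same part correlation, which is why the number of items in each “half” matters.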

Test V (Ability to Comprehend Mechanical 
Principles) and Test VII (Understanding of 
Modern Society) do not have as consistently 
high reliabilities as would be desired. The 
level of reliability of the other five tests appears 
to be sufficiently high to justify a good measure 
of confidence in the scores. It may be stated 
that there is a strong tendency in these data 
for a test to have the highest reliability coef- 
ficients in those groups having the highest mean 
or the greatest variability of scores. (It would 
be desirable both in Table 1 and in most 
of the subsequent tables to present the mean 
and standard deviation of each of the different 
groups studied on each of the eight Inventory 
scores, but space considerations make it unde- 
sirable to present all relevant data. The 
schools and other groups studied here and 
throughout this report differ very greatly 
among each other with respect to mean score 
and with respect to variability of scores on 
any one test.) 

Since the Composite Score is used much more 
than the score on any single test, its reliability 
is of greater importance than those presented in 
the table. The Composite Score reliability for 
the 366 cases in the April 1948 National Pro- 
gram group is 0.95. This is a highly satis- 
factory value. The comparable coefficients for 
the different school groups in Table 1 have not 
been calculated, but the general level of the 
test reliabilities indicates that the Composite 
Score reliabilities for the several schools could 
be expected to fluctuate in the neighborhood 
of 0.95. 


Twelve-School Study³ 


Studies were made of the correlations of 
Inventory scores with achievement records and 
other relevant data for the freshman and, in 
some cases, the sophomore classes in twelve 


³ This twelve-school study was planned by K. W. 
Vaughn as Director of the Graduate Record Office and 
was carried through largely under his direction. For 
earlier validity figures see K. W. Vaughn. Basic con- 
siderations in a program of freshman evaluation. 
J. Engng. Educ., 1944, 35, 161-179. 


engineering colleges.⁴ The test scores for these 
students had been obtained by testing the en- 
tire freshman class shortly after enrollment. 
The names of the institutions are as follows: 
California Institute of Technology; Carnegie 
Institute of Technology; Columbia University; 
Georgia School of Technology; Massachusetts 
Institute of Technology; Newark College of 
Engineering; North Carolina State College of 
Agriculture and Engineering; Oklahoma Agri- 
cultural and Mechanical College; Oregon State 
College; University of California at Los 
Angeles; University of Michigan; and Uni- 
versity of Texas. In two cases, two different 
groups of students from the same school were 
analyzed separately because they took dis- 
similar courses, so that fourteen, rather than 
twelve, “school groups” are discussed in the 
present report. In reporting results the iden- 
tity of the schools will not be disclosed. Dates 
of testing—which cover the period from 1944 to 
1946—and also numbers of individuals will be 
presented with the other data in the tables 
which follow. 

Care should be taken in considering the cor- 
relation or lack of correlation of the Pre- 
Engineering Inventory scores with school grades. 
A score on one of the Inventory tests may have 
little usefulness for predicting average school 
grades but may nevertheless prove to be very 
useful for guidance or for selecting individuals 
with good potentialities for success after gradu- 
ation from engineering school. The Under- 
standing of Modern Society test, for example, 
should not be expected to correlate highly with 
engineering school average grades. 

Another fact that must be taken into ac- 
count is that school grades are rarely as statis- 
tically reliable as the scores on objective tests. 
Hence correlations with school grades often 
present attenuated estimates of relationship. 
Moreover the necessity of utilizing only those 
students who actually were admitted to engi- 
neering school and persisted through one or 
more semesters of study limits the range of tal- 
ent to an upper range, as compared with the 
total range of talent in the group from which 


⁴ A year after the study reported here was begun, 
data on the academic achievement of students tested as 
freshmen were obtained from twenty-two additional 
engineering colleges. These data have not been ana- 
lyzed to date. 







selection was originally made. This definitely 
reduces the amount of correlation which is ob- 
tained, particularly if the degree of selection 
is great on those characteristics measured by 
the selection test. The present study, there- 
fore, is limited to demonstrating how well the 
Pre-Engineering Inventory correlates with the 
engineering school grades of students already 
admitted to engineering school. It does not 
demonstrate how well the Pre-Engineering In- 
ventory selects from among all engineering 
applicants those who will obtain the best engi- 
neering school grades, nor how well the In- 
ventory would predict an ideal measure of 
engineering school success above and beyond 
course grades, including the examinee’s later 
work as a practicing engineer. 
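
The effect of unreliable grades on an observed correlation can be illustrated with the classical correction for attenuation, which divides the observed correlation by the square root of the product of the two reliabilities. The authors do not apply this correction and report no reliability for school grades, so the sketch below is illustrative only: the function name and the grade reliability of .80 are hypothetical.

```python
import math


def correct_for_attenuation(r_observed, reliability_test, reliability_criterion):
    """Estimate the correlation between error-free measures.

    Classical formula: r_observed / sqrt(reliability_test * reliability_criterion).
    """
    if not (0.0 < reliability_test <= 1.0 and 0.0 < reliability_criterion <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_test * reliability_criterion)


# Hypothetical illustration: an observed validity of .60, a test reliability
# of .95, and an assumed grade reliability of .80.
print(round(correct_for_attenuation(0.60, 0.95, 0.80), 2))
```

The lower the reliability assumed for the grades, the further the observed coefficient understates the underlying relationship.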

1. Correlations of the Test Scores with Aver- 
age Grades. One of the main purposes of this 
study was to determine the validity of the Com- 
posite Score for predicting engineering school 
success as measured by the various average 
grades available for enrolled students. The 




Composite Score validities were obtained by 
computing the correlations between the Compo- 
site Score and the various available averages of 
course grades. The correlations between scores 
on the individual Pre-Engineering Inventory 
tests and average grades were also obtained in 
order to determine how scores on the separate 
tests might be related to engineering school suc- 
cess. These latter correlations should not be 
considered as validity coefficients since it is not 
intended that any single test score should be 
used to predict average grades in engineering 
school. 

The correlations of each test score and of 
the Composite Score with average first-term 
grades are presented in Table 2; the Composite 
Score validities for school groups having more 
than one term of achievement records are pre- 
sented in Table 3. It will be noted that Table 
2 shows a median correlation of 0.60 between 
Composite Score and average first-term grades, 
a very satisfactory value. Except where 
otherwise indicated by footnotes, all correla- 


Table 2 

Correlations of Pre-Engineering Inventory Scores with Average First-Term Grades 
(July 1944 to September 1946 Testings) 

Columns: School Group; Test I, General Verbal Ability; Test II, Technical Verbal Ability; Test III, Comprehension Scientific Materials; Test IV, General Mathematical Ability; Test V, Comprehension Mechanical Principles; Test VI, Spatial Visualizing Ability; Test VII, Understanding Modern Society; Composite Score (II+III+IV) 


55 A 
52 65 
AT 53 
50 
A3 
.52 
48 
46 
54 
31 
30 
. 38 
M 25 
N ‘ 29 




Median 3: A8 


_ 


Y Wie 
53 67 
Al 67 


58 38 A2 
.67 55 35 
65 50 a 
63 55 36 —- .66 
71 ae 42 Me 65 
63 40 32 46 61 
58 35 35 61 
58 A2 37 . 59 
A6 — 58 
&— 33 36 Jl 
59 36 30 t SO 
A2 37 .28 a 48 
51 30 22 4 44 
38 aa = 38 


58 37 35 


oo 


* There is more than 1 chance in 100, but no more than 1 chance in 20, that a correlation as large as this could 
arise solely from sampling fluctuations. 

** There is more than 1 chance in 20 that a correlation as large as this could arise solely from sampling fluc- 
tuations. 

*** This test was not given at this school. 













Table 3 

Validity of Composite Score for School Groups for Which Achievement Records Covering More 
Than One Term Were Available 
(July 1944 to November 1945 Testings) 

Average Grades: 1st-Term, 2nd-Term, Two-Term, Three-Term (each with Validity and No. of Cases) 

1st-Term 
Validity   No. of Cases 
  .68         (285) 
  .67         (176) 
  .66         (403) 
  .61         (391) 
  .59         (228) 
  .50         (84) 
  .48         (333) 
  .44         (195) 


tions in these tables are significant at the one 
per cent level—there is less than one chance 
in one hundred that correlations as large as 
these could arise solely from sampling fluctua- 
tions. The Composite Score validities are with- 
out exception significant in this sense. 
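
The statement about the one per cent level can be checked approximately for any correlation and group size. The article does not say which test of significance was used; the sketch below uses Fisher's z transformation with a normal approximation, which is an assumption, and the function names and the example values of r and n are hypothetical.

```python
import math


def correlation_p_value(r, n):
    """Two-sided p-value for the hypothesis of zero correlation.

    Uses Fisher's z transformation with a normal approximation
    (an assumed method, not necessarily the one used in the article).
    """
    if n <= 3:
        raise ValueError("need more than three cases")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


def significant_at_one_percent(r, n):
    """True when the approximate two-sided p-value is below .01."""
    return correlation_p_value(r, n) < 0.01


# Hypothetical illustrations.
print(significant_at_one_percent(0.44, 195))
print(significant_at_one_percent(0.10, 30))
```

With groups of a few hundred students, correlations of the size reported for the Composite Score fall far below the one per cent threshold under this approximation, while small correlations in small groups do not.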

[Table 3, remaining columns; column assignment not recoverable. Validities: .65, .70, .53, .50, .58, .63, .50. Numbers of cases: (208), (338); (208), (84), (338), (293), (195); (68), (274).] 

Within a number of the school groups there 
was great variability in the course of study 
taken by different students—a situation that 
inevitably reduces the Composite Score validity 
coefficients, since the average grades do not 
have a consistent meaning from student to 
student. Another situation that reduces the 
validity of the Composite Score in some groups 
is the inclusion in the grade average of such 
subjects as foreign languages, military science, 
social sciences, ethics, Bible, music, and so 
forth. 

As would be expected, investigation reveals 
a tendency for school groups where such situa- 
tions exist to have Composite Score validities 
below the median validity for all schools. 


Table 4 

Median Pre-Engineering Inventory Test Intercorrelations for Fourteen School Groups 

Rows and columns: 
I. General Verbal Ability 
II. Technical Verbal Ability 
III. Ability to Comprehend Scientific Materials 
IV. General Mathematical Ability 
V. Ability to Comprehend Mechanical Principles 
VI. Spatial Visualizing Ability 
VII. Understanding of Modern Society 
Composite Score (II+III+IV) 

[Legible entry: .63, in the General Verbal Ability row. The remaining entries are not legible in this copy.] 









Pre-Engineering Inventory as Predictor of Success 35 


School groups M and N of Table 2, the two groups with the lowest
validities, are both characterized by great variability in the course
of study pursued by different students. School group L and school
group M both consist of two or more distinct groups of students
matriculating at different times of year, and this fact probably
accounts in part for the low validities obtained for these groups.
Group K was a summer-school group that did not take a full term’s work
during their first “term” of study, a fact that would tend to reduce
the reliability of the average grade and thus reduce the Composite
Score validity for this group.

Of the groups with first-term validities below the median, four
(H, K, L, M) have Composite Score validities for more than one term
of work. In school group H the cumulative two-term validity is very
slightly lower than the first-term validity. In the other three school
groups the validities for predicting the two-term averages are higher
than those for predicting the first-term averages.

2. The Test Intercorrelations. Relation-


Table 5

Comparison of Validity Coefficients Obtained for Optimally Weighted Averages of Various Tests with the
Coefficients Obtained for a Single Pre-Engineering Inventory Score

Columns: School; Scores Used for Prediction; Validity Coefficients*; Optimal Weights** for
1. General Verbal Ability
2. Technical Verbal Ability
3. Ability to Comprehend Scientific Materials
4. General Mathematical Ability
5. Ability to Comprehend Mechanical Principles
6. Spatial Visualizing Ability
7. Understanding of Modern Society

Rows for each school: a single test; Composite; tests 2, 3, 4; tests 2, 3, 4, 7; All tests


3,4, 7 5! . 02 44 ; AS 
M 4,: : 05 57 — .08 
M 4,5,7 a —.12 55 — 15 - .28 
M All tests 5 —.19 10 F A ~ 21 03 24 


* The group of students on which Table 2 is based is not in every case exactly the same as the group used in 
the multiple correlation study. This fact causes a slight discrepancy between the correlations obtained in some 
cases. 

** The weights given are multiple regression weights for the case when the standard deviations of the test 
scores have been equalized 

*** School D did not administer Understanding of Modern Society 




ships among the individual tests are shown in 
Table 4, which presents for each possible pair 
of tests the median of the fourteen separate 
intercorrelations obtained for the fourteen 
separate school groups. All the median corre- 
lations are of such magnitude that they may be 
taken to represent real relationships among 
the tests. 


Multiple Correlation Study 


The three tests included in the Composite Score (Technical Verbal
Ability, Ability to Comprehend Scientific Materials, and General
Mathematical Ability) are in general the single tests most predictive
of engineering college grades. This may be seen by reference to
Table 2, in which the median of the correlations of each test with
first-term average grades is presented. However, this is not true for
every group. It can be seen that for some schools the General Verbal
Ability, Ability to Comprehend Mechanical Principles, and
Understanding of Modern Society scores had higher correlations with
first-term average grades than did Technical Verbal Ability. This
raises the question as to whether or not some alternative combination
of tests would provide a better composite score for predicting
average grades.

It was further found in Table 2 that in five of the fourteen school
groups the correlation with first-term average grades was actually
lower for Composite Score than for General Mathematical Ability. In
two additional groups the correlation with average grades was at the
same level for both Composite Score and General Mathematical Ability.
For these groups some other weighted average of test scores would
undoubtedly have provided better prediction of average grades than
did the Composite Score.

A multiple correlation study was therefore undertaken to obtain
evidence as to whether or not some other weighted average of
Pre-Engineering Inventory scores would be preferable to the Composite
Score now in use. Four schools, each with a large number of students,
were selected for study so as to be as representative as possible of
the schools for which data were available. For each school it was
determined what weights should be assigned to each of the
Pre-Engineering Inventory test scores in order that the weighted
average of all the tests should provide the mathematically best
possible prediction of average grades. The correlation of average
grades with this optimally weighted average of all tests was computed
as a measure of the effectiveness of the best prediction obtainable
(see Table 5). This multiple correlation may be considered to be a
validity coefficient.
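
The computation described above can be sketched from correlations alone. The example below is a minimal illustration, assuming the standard normal-equation approach: the weights for equal-standard-deviation scores solve the system formed by the test intercorrelations and the test-grade correlations, and the multiple correlation follows from those weights. The function names and any numbers fed to them are hypothetical; the article does not describe its computing procedure.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("test intercorrelation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def optimal_weights(r_tests, r_with_grades):
    """Return (weights for equal-SD scores, multiple correlation R).

    r_tests: square matrix of intercorrelations among the tests.
    r_with_grades: correlation of each test with average grade.
    """
    weights = solve_linear_system(r_tests, r_with_grades)
    r_squared = sum(w * r for w, r in zip(weights, r_with_grades))
    return weights, math.sqrt(r_squared)


# Hypothetical two-test example (values invented for illustration).
weights, multiple_r = optimal_weights([[1.0, 0.3], [0.3, 1.0]], [0.6, 0.5])
print(weights, multiple_r)
```

Because the multiple correlation is the best obtainable from a linear combination of the tests used, it can never fall below the validity of any fixed composite of those same tests, which is what makes it a useful ceiling for judging the Composite Score.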

As an example of how Table 5 may be in-
terpreted, it is seen that in School A the opti-
mum weights for Tests II, III, IV, and VII are
in the ratio of 12:29:24:17. The corresponding
weighted average of these tests correlates 0.696
with average grade, as compared to the com-
parable value of 0.678 for Composite Score.

In three of these schools it was found that 
the Composite Score functioned almost as well 
for predicting first-term average grade as 
would the optimally weighted average of 
all Pre-Engineering Inventory scores. In the 
fourth school the Composite Score did not 
function at all adequately from this point of 
view. Two facts about this school may be 
relevant here: first, practically none of the 
students at this school took English, or any 
related subject, during the first term; and 
second, the group studied in this school was 
composed of several entering freshman classes 
that had been combined in order to obtain an 
adequate number of cases for the statistical 
analysis. 

The weights obtained in Table 5 apply for 
the case when the scores on the tests have been 
adjusted so that all scores have equal standard 


Table 6

Optimal Relative Weights * for Raw Scores on Three Pre-Engineering Inventory Tests
for the Purpose of Predicting Average Grade

         Technical   Ability to Comprehend   General Mathematical
School   Verbal      Scientific Materials    Ability
         Ability
A        0.5         1.3
D        0.7         0.6
L        0.7         1.3
M        0.2         0.7

* These relative weights are constructed so that their
sum is 3.0, so as to facilitate comparison with the
weights used to obtain the Composite Score, which are
1.0:1.0:1.0.







Table 7

Correlations Between Pre-Engineering Inventory Scores and Course Grades
(403 Students Tested July 1944)

Tests*                                             First-Term Average Grade
I.   General Verbal Ability                        .39
II.  Technical Verbal Ability
III. Ability to Comprehend Scientific Materials
IV.  General Mathematical Ability
V.   Ability to Comprehend Mechanical Principles
VI.  Spatial Visualizing Ability
Composite Score (II+III+IV)

* Test VII was not administered at this institution.


deviations. Table 6 presents the optimal rela- 
tive weights for application to actual raw 
scores. The data shown in Table 6 for the 
four schools studied give some indication that 
better prediction of first-term grades might be 
obtained if Technical Verbal Ability were less 
heavily weighted in obtaining the Composite 
Score. The improvement of prediction so ob- 
tained would probably be of little, if any, 
practical importance in the case of most schools. 
A few of the schools, however, would probably 
benefit appreciably if they could construct 
their own composite score in accordance with 
their local needs. Obviously no single compo- 
site score will provide optimum prediction for 
every school. 
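
The step from the equal-standard-deviation weights of Table 5 to the relative raw-score weights of Table 6 can be sketched as follows. This is an illustration under an assumption the article does not spell out: each weight is divided by the standard deviation of its test, and the results are rescaled to sum to 3.0. The function name and the sample figures are hypothetical.

```python
def relative_raw_weights(equal_sd_weights, standard_deviations, total=3.0):
    """Convert weights for equal-SD scores into relative raw-score weights.

    Assumed procedure: divide each weight by its test's standard
    deviation, then rescale so the weights add up to `total`.
    """
    if len(equal_sd_weights) != len(standard_deviations):
        raise ValueError("need one standard deviation per weight")
    if any(sd <= 0 for sd in standard_deviations):
        raise ValueError("standard deviations must be positive")
    raw = [w / sd for w, sd in zip(equal_sd_weights, standard_deviations)]
    raw_sum = sum(raw)
    if raw_sum == 0:
        raise ValueError("weights sum to zero and cannot be rescaled")
    return [w * total / raw_sum for w in raw]


# Hypothetical weights and standard deviations for three tests.
print(relative_raw_weights([0.12, 0.29, 0.24], [10.0, 8.0, 12.0]))
```

Rescaling to a sum of 3.0 leaves the predictions' rank order unchanged; it only puts the weights on the same footing as the 1.0:1.0:1.0 weights of the Composite Score.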


Validity for Predicting Specific Course Grades⁵

The correlations between the Pre-Engineer- 
ing Inventory scores and first-term freshman 
grades in individual courses have been obtained 
for one of the larger colleges of engineering. 
The records of 403 students who took an 
identical course of study were available for 
this institution. Table 7 presents the correla- 
tions of the various test scores and of the Com- 


⁵ This study was planned by and carried through
under the direction of K. W. Vaughn as Director of the
Graduate Record Office.


Course Grades

Calculus   Chemistry   Physics   Engineering Drawing   English


37 32 15 3 


AS 19 


posite Score with grades in individual college 
subjects and with term averages. All correlations
are statistically significant at the one per
cent level. 


Validity for Predicting Achievement at 
the Sophomore Level 

The efficiency of the Pre-Engineering Inventory tests for predicting
engineering school achievement has also been investigated by
obtaining the correlations between the Pre-Engineering Inventory
tests and the various Engineering Achievement Tests. The Engineering
Achievement Tests are designed to measure achievement in several
specific branches of the beginning engineering curriculum. The seven
tests in this battery measure the students’ knowledge of fundamental
engineering principles and terminology and their application to the
solution of specific problems. The Achievement Tests are available to
colleges of engineering participating in the Measurement and Guidance
Project in Engineering Education and are used primarily in the
examination of regularly enrolled sophomore students at or near the
end of the academic year.

Data were available for 430 students in one 
school who took the Pre-Engineering Inventory 
in September 1946 and the Engineering A chieve- 










Table 8

Correlations Between Pre-Engineering Inventory Scaled Scores and Engineering Achievement Test
(Form B) Raw Scores
(236 Students at School Q Tested in October 1946 and May 1948, and 430 Students
at School R Tested in September 1946 and May 1948)

Pre-Engineering Inventory Tests                     School   English      Engineering
                                                             Expression   Drawing
I.   General Verbal Ability                         Q        .70          .32
                                                    R        .58          .36
II.  Technical Verbal Ability                       Q        .52          .41
                                                    R        .37          .42
III. Ability to Comprehend Scientific Materials     Q        .58          .52
                                                    R                     .47
IV.  General Mathematical Ability                   Q        .52          .35
                                                    R        .39          .37
V.   Ability to Comprehend Mechanical Principles    Q        .43          .54
                                                    R        .30          .46
VI.  Spatial Visualizing Ability                    Q        .36          .52
                                                    R        .48          .32
VII. Understanding of Modern Society                Q                     .32
                                                    R        .48          .33
Composite Score (II+III+IV)                         Q        .62
                                                    R        .51          .50

* Not administered.


ment Tests in May 1948, and for 236 students in 
a second school who took the Inventory in
October 1946 and the Achievement Tests in
May 1948. The correlations appear in Table 8.

On the whole the results are quite good in
view of the fact that a lapse of time as great as
two years intervenes between the two testings.
Even the lowest correlations in this table are
significant at the one per cent level. The most
striking thing about the data presented in
Table 8 is that the Composite Score is the
best or second best predictor of achievement,
as measured by the Engineering Achievement
Tests, in almost all of the several areas of
engineering education for both school groups.
In those cases in which the Composite Score is
not the best predictor, one of the tests comprising
the Composite Score is the best predictor
of achievement, with few exceptions.


Engineering Achievement Tests

Drawing   Electricity and Magnetism   Heat, Light, and Sound   General Chemistry   Physics (Mechanics)   Mathematics


46 3! mote oe 40 
33 : , 


66 a ‘ 44 
AS 
59 


49 




51 
Al 
49 
38 
33 
4 


46 
37 




4 


This study further substantiates findings 
that the Pre-Engineering Inventory in general 
and the Composite Score in particular are valid 
predictors of engineering school success. 


General Conclusions 


The results of the several test analyses de- 
scribed above indicate that the Pre-Engineering 
Inventory Composite Score on the whole per- 
forms very satisfactorily as a predictor of engi- 
neering school success. Of course, any fixed 
set of weights will not provide optimum predic- 
tion for every school. The use of a specially 
weighted combination of those tests that comprise
the Composite Score, or possibly of some
other tests, would undoubtedly appreciably 
improve the prediction in some schools. How- 







ever, it is not certain that the advantages 
gained by computing a special composite score 
would justify the expense involved. 

An examination of the correlations presented here for the
Pre-Engineering Inventory tests not included in the Composite Score
is of value for understanding these tests and their relation to the
engineering school curriculum. The utility of these tests cannot be
unequivocally evaluated by the approach used in the studies covered
here, since these tests have functions beyond the prediction of
college grades, particularly as aids in the guidance of the student
and in the selection of broadly gifted individuals who will do credit
to the profession after graduation.


Received November 18, 1949
Early publication 








The Kuder Literary Scale as Related to Achievement in 
College English * 


A. Kimball Romney 


University of Wisconsin 


The relation between scores obtained on
different types of interest inventories and
achievement has been summarized and ap-
praised elsewhere.¹ Most of the studies which
examined the relation between Kuder interests
and achievement showed positive relation-
ships.² It can be added that the studies were
done in the main with rather small groups
using, most frequently, college grades as a
criterion of achievement. Whether results
from larger groups using more refined methods
would give different results has not been known.

The present note is concerned with report-
ing results obtained using a precise index of
achievement on a fairly homogeneous group
of over a thousand subjects.

Specifically it concerns correlation data be- 
tween the Kuder literary score, ACE scores 
(American Council on Education Psychological 
Examination for College Freshmen, 1947 edi- 
tion), and achievement in college English 
classes. It is the nature of the achievement 
score that makes these data of special interest. 
They are noteworthy for the following reasons: 
(a) the group of subjects is composed of 1085 
(566 male, 519 female) freshman students who 
took English 1 Autumn Quarter at Brigham 
Young University in 1947; (b) since it included 
all new freshmen the results are somewhat free 
of many errors that might have been introduced 
by sampling procedures; (c) the subjects were 
all exposed to as near the same amount of 
instruction as possible during the quarter; (d) 
the achievement is judged not on the basis of a 
grade (as “A”, “B”, etc.) but rather on the
basis of a long (554-item), objective, carefully
administered achievement test given at the 

* The author is grateful to Dr. Antone K. Romney, 
Head of the Counseling Service at Brigham Young 
University, for permission to collect the data reported 
here. The pea do not necessarily reflect the policy 
or the nature of the work done by the Counseling 
Service. 

¹ Super, D. E. Appraising vocational fitness. New
York: Harper and Brothers, 1949. Pp. 727.
² Super, D. E. Op. cit., pp. 457-458.


end of the fall quarter; and (e) college aptitude 
as measured by the ACE is taken into account 
as an important variable. 


The English Achievement Examination 


The test, which covered all the areas taught 
in English 1 during Autumn Quarter 1947, was 
prepared by a faculty committee before the 
beginning of the quarter. It consisted of 554 
items arranged into five sections: vocabulary, 
essays, short stories, grammar, and miscella- 
neous. At the beginning of the quarter all 
English teachers received a letter from the
department head indicating the areas of the 
test, amount of time which would be allowed 
for each area, and when it would be given. 
This information was given to the students at 
the same time. Similar letters were sent out 
at mid-term and three weeks before the end of 
the quarter. No teacher knew the exact ques- 
tions and none of the teachers who taught in 
the experiment had seen the test before it was 
administered. 

At the beginning of the quarter the students
were divided into ability areas on the basis of
the Purdue English Placement Test. Teachers
were assigned to teach the respective groups,
i.e., each taught some groups classified as
“poor,” some classified as “intermediate,” and
some groups classified as “good.” This scattered
the teachers over the entire classification
of groups, equalizing the teaching factor as
nearly as possible. All teachers used the same
text and the same units of work for all classes.

At the close of the quarter, qualified
teachers from the university were called in by
the English department and given in-
structions as to the administration of the
achievement test, which was administered to all
students the same afternoon under the same
conditions. The teachers and tutors who
administered it had all received special training 
for that purpose and the test, which was under 





The Kuder Literary Scale and Achievement 


the supervision of the Counseling Service, was 
scored on International Business Machine 
equipment. 


Results 


The various correlation coefficients ob-
tained are shown in Table 1.

It can be seen that the Kuder literary scale
has a small but statistically significant correlation
with English achievement, which is roughly
.3 for both males and females. This does not
rise significantly when possible effects of ACE
are partialled out.

The correlation between ACE and English
achievement was high for both males and
females, .69 and .84 respectively.³

With reference to the multiple R, it can be 
seen that no significant changes result by the 
addition of the Kuder literary scale. It is to 
be noted that the Kuder scale and the ACE 
are apparently quite independent of each other. 
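
The partial and multiple correlations discussed here can both be obtained from three zero-order correlations. The sketch below uses the standard textbook formulas for the three-variable case. The function names are illustrative, and the Kuder-ACE correlation used in the example is a hypothetical value, since the article does not report it.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z correlates perfectly with x or y")
    return (r_xy - r_xz * r_yz) / denominator


def multiple_correlation(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple R of y on two predictors, from zero-order correlations."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# .29 and .84 are the female coefficients from Table 1; the Kuder-ACE
# correlation of .10 is a hypothetical stand-in, not a reported figure.
r_english_kuder, r_english_ace, r_kuder_ace = 0.29, 0.84, 0.10
print(partial_correlation(r_english_kuder, r_kuder_ace, r_english_ace))
print(multiple_correlation(r_english_ace, r_english_kuder, r_kuder_ace))
```

When the second predictor correlates only weakly with the criterion, the multiple R can exceed the larger zero-order correlation by very little, which is the pattern the text describes for the addition of the Kuder scale to the ACE.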

It can be concluded that as far as these 
data are concerned the correlation between 


³ No explanation is offered for the difference in coefficients
observed between male and female scores.


Table 1

Correlation Coefficients between English Achievement,
Kuder Literary Scale, and ACE

                                            Independent Variables
Dependent Variable                  Kuder      ACE        Kuder Plus   Kuder with ACE
                                                          ACE (a)      Constant (b)
English Achievement, Male
  (N = 566)                                    .69±.02    .71±.02      .28±.04
English Achievement, Female
  (N = 519)                         .29±.04    .84±.01    .85±.01      .29±.04

(a) Multiple correlation.
(b) Partial correlation.


achievement in a college English class and the 
Kuder literary scale is very low, even though 
statistically significant. 
Received October 5, 1949 

Early publication 










Scores on the Strong Vocational Interest Blank and the Kuder 
Preference Record in Relation to Self Ratings 


Ralph F. Berdie 


Student Counseling Bureau, Office of the Dean of Students, University of Minnesota 


Expressed interests and measured interests 
are far from identical. In terms of common 
elements, perhaps no more than between 25 
and 50 per cent of the factors associated with 
one are associated with the other. Various 
studies report correlation coefficients ranging 
about .50 between measured and expressed 
interests, depending upon the test used, the 
method employed for eliciting and classifying 
expressed interests and the sample. 

In a study of the Strong Vocational Interest
Blank scores of 1000 men who came to a Uni-
versity Testing Bureau, Darley (5) reported
contingency coefficients between claimed voca-
tional choices and interest score patterns
ranging from .35 to .57. He concluded,
“Claimed vocational choices cannot be sub-
stituted for measured interests in effective
counseling.”

Lalegar (11) also used the Strong Blank,
along with the Manson Occupational Blank,
to determine the relationship between the test
scores and the occupational choices of 703
eleventh grade girls, an age group for which
the Strong Blank is not too appropriate. She
concluded, “Insofar as stated choice of occupa-
tion by groups of individuals may be considered
a true criterion of interest, the lack of relation-
ship between the statement of occupational
choice and interest scores or letter ratings may
be considered evidence of the lack of validity
of the interest inventory.” As one must dis-
agree with Lalegar’s assumption regarding the
“true criterion” of interests, so must one also
reject her conclusion that vocational interest
inventory scores cannot contribute materially
to vocational planning.

Kopp and Tussing (9) report the relation-
ship between scores on the Kuder Preference
Record and responses to a simple questionnaire
used for appraising vocational interests. For
each student, the nine scores on the Kuder
Record were ranked in order from one to nine,
and these rankings were correlated with the
orders in which the occupations were ranked
by the students. The authors do not report
how these correlation coefficients were ob-
tained, but they state that for 115 boys in
grade 10B, the correlation between question-
naires and Kuder scores was .59 and for 117
girls in grade 10B, the correlation was .50.

Rose (14) gave the Kuder blank to 60 “un-
selected” veterans referred to a Vocational
Advisement Unit. Each veteran was then
given nine cards. On each card were 17 occu-
pations selected from Kuder’s manual (10) as
characteristic of a given area. The subjects
ranked the nine areas according to preference
for the occupation. The coefficient of contin-
gency obtained was .61. The rank order cor-
relation coefficients for each of the 60 cases
ranged from —.05 to .99, with a median rho
of .64.

Crosby and Winsor (4) administered the 
Kuder Preference Record to 222 men and 
women sophomores in a college of agriculture 
and home economics. These students, after 
an explanation of the type of interests meas- 
ured by the test, were asked to estimate their 
percentile scores, using the Kuder profile 
sheets. The correlation coefficients between 
obtained scores and estimated scores on the 
seven scales ranged from .39 to .66. The 
median r was .54.

Information is available about the relation- 
ship between the Kuder and the Strong tests 
(18), but no study reported to date allows 
direct comparisons between the relationship 
of the Kuder Record and self ratings and the 
Strong Blank and self ratings. Paterson (13), 
reporting the case of a returning veteran who 
took both tests, raises a question concerning 
the relative ease of “faking” the tests. He 
suggests: “The Kuder and Strong inventories
both yield important information about an 
individual’s interest patterns, when obtained 





Scores on the Strong Vocational Interest Blank 43 


in a guidance situation. However, in a selection
situation, it would appear that the Strong
is to be preferred because it is more subtle 
and the vocational significance of liking or 
disliking each of the 400 items is not so readily 
apparent to the person taking the test.” 

Longstaff (12) reports an investigation of 
the fakability of the Strong test and the Kuder 
test where 35 men and 24 women took the 
Strong test, men’s form, and 37 men and 22 
women took the Kuder test. The subjects 
first took the tests under “normal” conditions 
and then took them with directions to raise 
their scores on certain scales and lower them on 
others. The results indicated both tests could 
be faked and that some scales were more 
fakable than others. The Strong test was 
easier to fake upwards, the Kuder test easier 
to fake downward. 

A theory of vocational interests with particular reference to the
Strong Blank has been discussed by Bordin (2). He presents the
hypothesis that, “In answering a Strong Vocational Interest Test, an
individual is expressing his acceptance of a particular view or
concept of himself in terms of occupational stereotypes.” In support
of this hypothesis, Bordin cites the relationship between claimed and
measured interests and also the ability of subjects to manipulate or
fake their vocational interest patterns. These phenomena can be
explained making use of concepts other than those used by Bordin and
we already have suggested here, in reference to the relationship
between measured and claimed interests, the effect of common elements.


Purpose


The purpose of this investigation was to determine the relative
agreement between scores on the Strong Vocational Interest Blank and
self ratings and between scores on the Kuder Preference Record and
self ratings, using the same sample. Does the picture a person has of
his own interests correspond more closely to the picture given by the
Strong Blank or to that given by the Kuder Record?


Method


Each man who came to the Student Counseling Bureau of the University
of Minnesota during the first part of 1948 and who was to take an
interest test was given the Strong test, the Kuder test and a self
rating form.

The rating form was of the graphic rating type and covered the
following nine occupational areas: (1) biological sciences; (2)
artistic creation and appreciation; (3) physical sciences; (4)
technical occupations; (5) social service; (6) musical occupations;
(7) business detail; (8) selling; and (9) verbal or literary. For
each area the graphic scale was identical. The following is an
example:


1. This occupational area centers about the biological sciences and
includes medicine, dentistry and psychology.

My interests are very much unlike interests of people in this area
Somewhat dissimilar
No marked similarity or dissimilarity
Somewhat similar
My interests strongly resemble the interests of people in this area


The order of presentation of the tests and 
the rating form was varied systematically so 
that one-third of the 500 men tested took the 
Strong test first, one-third took it as the second 
test and one-third took it as the last test in the 
series. Similarly, the order of presentation for
the other test and rating form was rotated. 

The ages of the 500 men varied from 14 
years through 37 years, with a mean age of 20.8 
years, a median age of 20.6 years and a stand- 
ard deviation of 3.5 years. Only seven men 
were 16 years of age or younger, only 11 were 
30 years old or older. 

The largest single group, the pre-college
group (N=195), consisted of people not yet
registered within the University but planning
on matriculating within one calendar year.
The largest proportion of these people were
completing or just had completed their senior
year in high school. The non-college group
(N=19) consisted of people who were not in
residence in the University and who were not







44 Ralph F. Berdie 


planning on entering within the next year. The students in the
College of Science, Literature and the Arts (N=126), the Institute of
Technology (N=89), the College of Agriculture, Forestry and Home
Economics (N=13), the College of Education (N=3), and the College of
Pharmacy (N=2) ranged from students beginning their first year of
college to students completing their fourth year. Similarly, students
from the remaining colleges tended to come from all classes within
those colleges. Most of these 500 students, however, were people who
had completed less than two years of college.


Table 1

Categories Used in Classifying Scales of the Strong Blank, the Kuder Record and the Self-Rating Form with the
Number of Subjects Having Different Types of Patterns or Scores in Each Area on Each Test


Artistic
Strong: artist, architect. Self-Rating: item 2. Kuder: artistic scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      13         5       55         75-100    120
secondary pattern    15         4       136        65-74     48
tertiary pattern     46         3       136        50-64     57
no pattern           426        2       91         0-49
                                1       82


Scientific-Biological
Strong: osteopath, physician, psychologist, dentist. Self-Rating: item 1. Kuder: scientific scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      47         5       79         75-100    114
secondary pattern    38         4       193        65-74     46
tertiary pattern     60         3       98         50-64     77
no pattern           355        2       67         0-49      263
                                1       63


Scientific-Physical
Strong: mathematician, physicist, chemist, engineer. Self-Rating: item 3. Kuder: scientific scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      27         5                  75-100    114
secondary pattern    7          4                  65-74     46
tertiary pattern     26         3                  50-64     77
no pattern           440        2                  0-49
                                1


Technical
Strong: farmer, aviator, carpenter, printer, M. & P.S. teacher, policeman, forest service man.
Self-Rating: item 4. Kuder: mechanical scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      150        5       98         75-100    97
secondary pattern    72         4       169        65-74     47
tertiary pattern     95         3       111        50-64     52
no pattern           183        2       62         0-49
                                1       60







Table 1 (Continued)


Social Service
Strong: YMCA phys. dir., personnel dir., pub. admin., YMCA secy., social science teacher,
school supt., minister. Self-Rating: item 5. Kuder: social service scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      71         5       91         75-100    246
secondary pattern    57         4       173        65-74     40
tertiary pattern     83         3       100        50-64     63
no pattern           289        2       73         0-49
                                1       63


Musical
Strong: musician. Self-Rating: item 6. Kuder: musical scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      74         5       51         75-100
secondary pattern               4       109        65-74
tertiary pattern     77         3       104        50-64
no pattern                      2       95         0-49
                                1       141


Business Office (Clerical)
Strong: CPA, accountant, office mgr., purch. agent, banker, mortician. Self-Rating: item 7.
Kuder: computational and clerical scales.

Strong                          Self-Rating        Kuder
                                                             Number            Number
Type of Pattern      Number     Score   Number     Score     (computational)   (clerical)
primary pattern      81         5       70         75-100    151               112
secondary pattern    78         4       152        65-74     26                29
tertiary pattern     96         3       118        50-64     55                56
no pattern           245        2                  0-49      268               303
                                1


Sales
Strong: sales manager, real estate sales, life insurance sales. Self-Rating: item 8.
Kuder: persuasive scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      128        5                  75-100    271
secondary pattern    94         4       156        65-74     46
tertiary pattern                3       123        50-64     37
no pattern                      2                  0-49      146
                                1       93


Verbal-Literary
Strong: advertising, lawyer, author-journalist. Self-Rating: item 9. Kuder: literary scale.

Strong                          Self-Rating        Kuder
Type of Pattern      Number     Score   Number     Score     Number
primary pattern      45         5       77         75-100    177
secondary pattern    51         4       157        65-74     55
tertiary pattern     140        3       111        50-64     57
no pattern           261        2       87         0-49      211
                                1       68









The 37 occupational scales of the Strong
Blank were divided into nine groups, as shown
in Table 1, and the 9 scales of the Kuder test
were similarly classified. Two scales of the
Kuder, computational and clerical, were classified
as Business-Office, and the Kuder Scientific
scale was considered as being both “Biological
science” and “Physical science.”

Thus, for each student were available 37 
scores on the Strong, nine scores on the Kuder 
and nine scores on the self rating form. A 
pattern analysis (5) was made of the Strong 
scores, according to the groups in Table 1. If 
within a group, a plurality of the scores were 
A’s and B+’s, that was a primary pattern. 
If a plurality of scores were B+’s and B’s, that
was a secondary pattern. If a plurality of 
scores were B’s and B—’s, that was a tertiary 
pattern. All other groups were labeled “no 
pattern.” In case of the musician’s scale, an 
A was called a primary pattern, a B+, a 
secondary pattern, and a B, a tertiary pattern. 

A percentile score of 75 through 100 on the
Kuder was called a primary pattern, a per-
centile score of 65 through 74 a
secondary pattern, and a percentile score of
50 through 64 a tertiary pattern.
All other scores were “no pattern.” The self
ratings in each area were given values from
one to five, with five indicating greatest
similarity of interests.
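The Kuder cutoffs and the single-scale musician rule described above can be written as a small lookup. This is a minimal sketch, not the procedure as the author carried it out by hand; the function names `kuder_pattern` and `musician_pattern` are hypothetical, and the sketch assumes percentiles are whole numbers from 0 to 100.

```python
# Hypothetical helper names; the cutoffs themselves come from the text above.

def kuder_pattern(percentile: int) -> str:
    """Map a Kuder percentile score (assumed 0-100) to a pattern label."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if percentile >= 75:
        return "primary"
    if percentile >= 65:
        return "secondary"
    if percentile >= 50:
        return "tertiary"
    return "no pattern"


# The musician group holds a single scale, so the letter grade decides directly.
MUSICIAN_PATTERNS = {"A": "primary", "B+": "secondary", "B": "tertiary"}


def musician_pattern(grade: str) -> str:
    """Map a Strong letter grade on the musician scale to a pattern label."""
    return MUSICIAN_PATTERNS.get(grade, "no pattern")


print(kuder_pattern(80), kuder_pattern(70), kuder_pattern(55), kuder_pattern(30))
print(musician_pattern("A"), musician_pattern("B-"))
```

The plurality rule for the multi-scale Strong groups is not sketched here because the text does not say how ties between adjacent grade pairs were resolved.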

Contingency coefficients and chi squares 
were then computed to indicate the degree and 
significance of relationship between the Strong 
Blank and self ratings and between the Kuder 
Test and self ratings.¹
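For readers who want to see the two statistics side by side, the following sketch computes chi square for a contingency table and then Pearson's contingency coefficient, C = sqrt(chi² / (N + chi²)). The 2 x 2 table is invented for illustration; the study's own tables would cross the four pattern levels with the five self-rating values. The sketch assumes every row and column total is greater than zero.

```python
from math import sqrt


def chi_square(table):
    """Return (chi square, N) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def contingency_coefficient(table):
    """Pearson's contingency coefficient C = sqrt(chi2 / (N + chi2))."""
    chi2, n = chi_square(table)
    return sqrt(chi2 / (n + chi2))


# Invented counts, for illustration only.
example = [[30, 10],
           [10, 30]]
chi2, n = chi_square(example)
print(chi2, n, round(contingency_coefficient(example), 3))
```

Because the upper limit of C depends on the number of rows and columns, coefficients from tables of different sizes are not strictly comparable.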


Results 


The three indices of interests first can be 
compared on the basis of the frequency of 
significant scores in the nine areas. Kuder 
says that percentile scores of 75 and above on 
his test are significant for purposes of voca- 
tional counseling (10). Strong says that 
scores of A and B+ on his test are significant 
(16), and Darley states that primary patterns 
on the Strong test are most significant (5). 

Table 1 reveals that the number of signifi-
cant scores, as defined by these authorities, is
much greater on the Kuder test in eight of
nine areas; in only one area, the sub-
professional or technical, are there more signifi-
cant scores on the Strong test.

¹ The formula for the contingency coefficient was
obtained from Guilford, J. P., Psychometric methods,
New York: McGraw-Hill Book Co., 1936, pp. 357-359.

On the Kuder test, 24 per cent of the group 
had significantly high scores on the artistic 
scale, while on the Strong, only three per cent 
had significantly high scores. Only 15 per 
cent had any type of interest pattern in this 
area on the Strong. The self ratings
were more symmetrically distributed,
with 11 per cent claiming much interest in
this area.

Distributions in the musical area are some- 
what similar to those in the art area, with 15 
per cent obtaining A’s or B pluses on the 
musician’s scale of the Strong and 35 per cent 
obtaining percentile scores of 75 or more on 
the musical scale of the Kuder. On the self
ratings, many more men indicated no interest 
in this area than there were men who indicated 
much interest. 

Again in the scientific areas, relatively few 
significant scores were found on the Strong 
profile, while many were found on the Kuder. 
Combining the biological and physical science 
groups on the Strong, only 74 primary patterns 
were identified. On the Kuder scientific scale, 
23 per cent received percentile scores of 75 or 
above. The differences in the social service 
area are even greater, with 14 per cent having 
primary patterns on the Strong and 49 per 
cent having high scores on the Kuder. The 
trends are similar in the business and literary 
areas. 

In the technical and mechanical area, 30 
per cent had primary interest patterns on the 
Strong and only 19 per cent had high scores 
on the Kuder. In only one of the nine areas, 
the sales area, were there more primary Strong 
patterns than there were in this technical area. 
On the Kuder, fewer people had scores of 75 
or more on the mechanical scale than on any 
other scale. 

Thus, on the Kuder, the areas having the 
greatest number of high scores were the sales 
(persuasive) and the social service areas and 
on the Strong, the areas having most primary 
patterns were the sales and the technical areas. 
The areas on the Kuder test having the fewest 
high scores were the mechanical and clerical 
areas and on the Strong the areas having the 





Scores on the Strong Vocational Interest Blank 47 


fewest primary patterns were the artistic and 
the physical science areas. 

In terms of self ratings, most students rated 
themselves as very much interested in physical 
science and technical work, and fewest rated
themselves as very interested in music
and art.

On the Strong test there was a total of 639 
primary patterns, or an average of 1.3 per 
student and a total of 481 secondary patterns. 
Combining these figures, the average student 
had 2.2 primary and secondary patterns. On 
the Kuder, 1463 scores were at or above the 
75th percentile. The average student had 
2.9 scores which according to Kuder are signifi- 
cant in counseling. In the norm group upon 
which these percentile scores were based, for 
a group of 500 men, one would find 1125 per- 
centile scores of 75 or above out of the 4500 
possible scores available on the nine scales of 
the test or an average of 2.3 significant scores 
per man. 
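The per-student averages in this paragraph follow from simple division, sketched below. The sample size of 500 is an assumption inferred from the Table 1 column totals and from the "group of 500 men" comparison; it is not stated as the sample size in this passage. The norm-group figure works out to 2.25, which the text reports as 2.3.

```python
# Assumption: 500 students, inferred from the Table 1 column totals.
N_STUDENTS = 500
N_KUDER_SCALES = 9

primary_patterns = 639      # Strong primary patterns (from the text)
secondary_patterns = 481    # Strong secondary patterns (from the text)
kuder_high_scores = 1463    # Kuder scores at or above the 75th percentile

primary_per_student = primary_patterns / N_STUDENTS
combined_per_student = (primary_patterns + secondary_patterns) / N_STUDENTS
kuder_per_student = kuder_high_scores / N_STUDENTS

# In the norm group one quarter of all percentile scores fall at 75 or above.
norm_possible = N_KUDER_SCALES * N_STUDENTS
norm_high = norm_possible // 4
norm_per_man = norm_high / N_STUDENTS

print(f"primary per student:   {primary_per_student:.2f}")
print(f"primary + secondary:   {combined_per_student:.2f}")
print(f"Kuder high per student: {kuder_per_student:.2f}")
print(f"norm group: {norm_high} of {norm_possible}, {norm_per_man:.2f} per man")
```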

The relationship found here between meas-
ured and expressed interests approximates
that reported in previous studies. The median
contingency coefficient between the Strong
Test and self ratings was .43, between the
Kuder test and self ratings, .52. Table 2
presents the coefficients showing the degree of
relationship in each area between each of the
two tests and self ratings. The chi squares
were all statistically significant beyond the one
per cent level of probability.


Table 2

Contingency Coefficients Showing Relationships Be-
tween Self Ratings of Vocational Interests and
Scores on the Strong Vocational Interest
Blank and on the Kuder Preference
Record

Occupational            C with      C with
Area                    Strong      Kuder

Technical                .55         .47
Computational            .61         .34
Physical Sciences        .32         .46
Social Service           .43         .52
Musical                  .39      [illegible]
Sales                    .58         .58
Biological Sciences      .27         .30
Verbal-Literary          .51         .61
Artistic                 .33         .58
Clerical                (.61)*       .52

* This is the same statistic as for “Computational.”

In two areas, the Strong scores are more 
closely related to self ratings than are the 
Kuder scores. In five areas, the Kuder scores 
are more closely related to self ratings. The 
range of contingency coefficients for the Strong 
is from .27 to .61, for the Kuder, from .30 to .61. 

On both tests, self ratings of interest in the 
biological sciences are least related to relevant 
scores on the tests. The next poorest agree- 
ment is found in the physical science area where 
self ratings have a relationship in terms of 
contingency coefficients to Kuder and Strong 
patterns of .46 and .32 respectively. Self 
ratings in the sales area tend to be related to 
patterns on both tests. 

Disagreement is found in the clerical and 
computational areas. The patterns on the 
Business-Office group of the Strong tend to be 
accompanied by similar self ratings. The 
Kuder computational scores were only slightly 
related to these self ratings. The Kuder 
clerical scores were more related to clerical 
self ratings than were Kuder computational 
scores. 

In the case of those areas which have strong 
avocational implications, art and music, the 
relations between self ratings and Kuder scores 
are greater than those between self ratings and 
Strong scores. The explanation is perhaps that 
most students are unable to differentiate their 
thinking about vocational and avocational in- 
terests. Clinical experience suggests these two 
scales of the Kuder test are much more meas- 
ures of avocational interests than of voca- 
tional interests. 

When the results obtained here are com- 
pared to those reported by Crosby (4), the 
relative ease of self-estimating scores in the 
persuasive area is apparent. Crosby found, for 
two groups, correlations of .62 and .66 between 
Kuder persuasive scores and_ self-estimated 
scores in this area, as compared to a con- 
tingency coefficient of .58 found here. Simi- 
larly, in the literary area, Crosby’s correlations 
of .51 and .61 can be compared to the contin- 
gency coefficient of .61 obtained here between 
self ratings and the Kuder. 




Discussion 

The results presented here are in general 
agreement with the results obtained by other 
investigators and the correlation between 
measured and self-estimated interests approxi- 
mates .50. In agreement with Paterson’s 
hypothesis concerning the relative subtlety of 
the two tests, scores on the Kuder tend to have 
a closer relationship to self ratings of interests 
than do the scores on the Strong. This may 
be a function not only of the items in the tests 
but also of the categories used in grouping the 
scales and defining the self ratings, although 
these categories were achieved through careful 
study of both tests. 

The men studied here found it relatively 
difficult to estimate their own scientific inter- 
ests, as they were measured by the tests, and 
on the other hand, the men were more able to 
estimate their persuasive and sales interests. 
In no occupational area, however, was there 
close enough agreement between measured in- 
terests and self-estimated interests to suggest 
that counseling can be done on the basis of one 
or the other. As long as measured interests
have a relevancy for vocational satisfaction 
and as long as self-estimated interests play an 
important role in the vocational deliberations 
of individuals, both types of interests must be 
considered. 

The counselor working with individuals 
similar to those in this sample can expect to 
find many more significant scores, as defined 
by the test authors, on the Kuder blank than 
on the Strong test. Here each person averaged 
2.9 scores of 75 or above on the Kuder and only 
1.3 primary patterns on the Strong. This is a 
result of the Kuder norms being based on 
groups very similar to those studied here in 
contrast to the Strong scores which are based 
on norms derived from occupational groups (6). 
Both kinds of scores can be useful in counseling 
but the counselor should recognize that in 
every case but that of the technical-mechanical 
area, a “significant” score on the Strong test
is a rarer occurrence than a “significant”
score on the Kuder.

Whereas one-half of a group similar to this 
will have high social service scores on the 
Kuder, only 14 per cent will actually have 
interests that resemble those of men in social 


service work, as implied by their primary pat- 
terns in group V on the Strong test. Over 
one-half, however, will indicate they bear a 
marked resemblance to the men in this field, 
as shown by their self ratings. Strangely 
enough, all three indices can be correct. 
Strong discusses the predominant similarities 
of interests and would perhaps agree that most 
of these subjects are more like men in social 
service than they are unlike them. The 
Strong blank does not take into account these 
similarities, however, but rather, utilizes the 
differences between groups, and consequently, 
relatively few of this sample obtain primary 
patterns in the social service area. In other 
words, the subjects who receive primary 
patterns in group V share with the men in the 
standardization groups those interests which 
tend to make them different from men in 
general. We can only say that those subjects 
who obtain high scores on the social service 
scale of the Kuder are showing interests which 
are shown by people in the social service areas, 
many of these interests also being held in 
common with men not in these areas. 

One explanation for the relatively greater 
ease of estimating scores on the Kuder may 
thus be derived. In estimating Kuder scores, 
the subject needs consider only his similarities 
to men in the defined groups, but in estimating 
Strong scores, he needs to consider both how 
he resembles men in the defined group and 
also how he differs from men in general.

These considerations do not adequately ex- 
plain the degree of agreement between self- 
estimates of interests and test scores. Recog- 
nizing that this agreement is reduced by the un- 
reliability of test scores and the inconstancy of 
self ratings, what are those factors which are 
related to both interest scores and _ self- 
estimates? 

Preliminary determinations of some of these 
factors have been reported (1). In an in- 
tensive study of 136 college students, college 
aptitude was found to be significantly related 
to measured interests in engineering but not 
to expressed interests. Those students with 
measured engineering interests participated in 
fewer religious activities than did those stu- 
dents with no measured interests. This dif- 
ference in activities was not found when the 
group was analyzed on the basis of expressed 







engineering interests. Preference for high 
school mathematics teachers was related to 
measured interests in engineering but not to 
expressed interests. Family background was
related equally to both types of interests. 
When business interest groups were compared, 
morale scores on the Minnesota Personality 
Scale were related to expressed interests in 
business but not to measured interests, and 
frequency of “social” activities bore the same 
relationships. An elaboration of this type of 
study would throw light on the differences be- 
tween measured and expressed interests. 

Other questions arise. Why is there a 
tendency for scores and self-estimates in some 
areas to be more closely related than in other 
areas? What distinguishes those people who 
can quite accurately estimate their interests, 
as measured by tests, from those who cannot? 
Many questions remain to be answered con- 
cerning the relationships between measured 
and estimated interests. 


References 


1. Berdie, R. F. Factors associated with vocational
interests. J. educ. Psychol., 1943, 34, 257-277.

2. Bordin, E. S. A theory of vocational interests as
dynamic phenomena. Educ. & Psychol. Meas.,
1943, 3, 49-65.

3. Christensen, T. E. Some observations with re-
spect to the Kuder Preference Record. J. educ.
Res., 1946, 40, 96-107.

4. Crosby, R. C., and Winsor, A. L. The validity of
student estimates of their interests. J. appl.
Psychol., 1941, 25, 408-414.

5. Darley, J. G. Clinical aspects and interpretation of
the Strong Vocational Interest Blank. New York:
Psychological Corporation, 1941.

6. Diamond, S. The interpretation of interest pro-
files. J. appl. Psychol., 1948, 32, 512-520.

7. Gordon, H. C., and Herkness, W. W. Do voca-
tional questionnaires yield consistent results?
Occupations, 1942, 20, 424-429.

8. Kelso, D. F., and Bordin, E. S. The ability to
manipulate occupational stereotypes inherent in
the Strong Vocational Interest Test. Amer.
Psychol., 1948, 3, 352-353.

9. Kopp, T., and Tussing, L. The vocational choices
of high school students as related to scores on
vocational interest inventories. Occupations,
1947, 25, 334-339.

10. Kuder, G. F. Kuder Preference Record, Revised
Manual. Chicago: Science Research Associates,
1946.

11. Laleger, G. E. Vocational interests of high school
girls. New York: Bureau of Publications,
Teachers College, Columbia Univ., 1942.

12. Longstaff, H. P. Fakability of the Strong Voca-
tional Interest Blank and the Kuder Preference
Record. J. appl. Psychol., 1948, 32, 360-369.

13. Paterson, D. G. Vocational interest inventories in
selection. Occupations, 1946, 25, 152-153.

14. Rose, W. A comparison of relative interest in
occupational groupings and activity interests as
measured by the Kuder Preference Record.
Occupations, 1948, 26, 302-307.

15. Stefflre, B. The reading difficulty of interest in-
ventories. Occupations, 1947, 26, 95-96.

16. Strong, E. K. Vocational interests of men and
women. Stanford: Stanford Univ. Press, 1943.

17. Thorndike, E. L. Adult interests. New York:
Macmillan, 1935.

18. Wittenborn, J. R., Triggs, F. O., and Feder, D. D.
A comparison of interest measurement by the
Kuder Preference Record and the Strong Voca-
tional Interest Blanks for men and women.
Educ. & Psychol. Meas., 1943, 3, 239-257.








Visual Differentiation of Moving Objects 


Newell C. Kephart and Guy G. Besnard 
Division of Applied Psychology, Purdue University 


Much work has been done at the Occupa- 
tional Research Center, Purdue University, in 
cooperation with various industrial plants in an 
attempt to find skills which are pertinent to 
performance on various industrial jobs. At 
the present time employees of cooperating 
plants are tested and a statistical comparison 
of job performance and test scores is made. 
Recently, research has been attempted to see
whether it would be possible to attack the
problem from another angle by setting up in
the laboratory counterparts of industrial jobs
or parts of jobs. As one such approach the
study of the visual differentiation of moving
objects was selected.


Experimental Procedure 


Description of Apparatus and Material. The appa- 
ratus consisted of an inclined trough down which clear 
glass spheres could be rolled. This trough was made 
of wood, 46 inches long and 1.5 inches wide. The 
inside was lined with white paper since preliminary 
experiments showed that a dark lining made the task 
too easy. A second trough underneath the first one 
and sloping in the opposite direction allowed the 
spheres to return to the experimenter in the original 
sequence (see Figure 1). 

Twenty clear glass marbles of the ordinary commer-
cial type were provided. Of these, 10 were left plain
and the remaining 10 were scratched with a sharp
instrument so that three circles were on three circum-
ferences in three perpendicular planes. For the pur-
pose of terminology in the test situation the unmarked
spheres were called “good” and the etched spheres
“bad.”

A head rest was provided to keep the subject's head 
in a specified position during the experiment, so that 
each subject would view the spheres from approxi- 
mately the same distance and at approximately the 
same angle. In the early experiments, the spheres 
were released at the top of the trough and allowed to 
roll down by themselves. Since the spheres were im- 
mobile at the time of release, their rate of speed near 
the top of the trough was so slow that they were very 
easily seen by the subject and the discrimination was 
too easy. In order to make the task more difficult, we 
had to eliminate the possibility of seeing the sphere 
until it had attained adequate speed. In order to do 
this, a screen was placed in such a position as to permit 
the subject to view only the last 18 inches of the trough, 
but not to view the first 28 inches.


The room in which the experiment was conducted 
was kept in darkness except for a 10.5 inch, 40 watt 
tubular Lumiline bulb, mounted in a clear tin reflector. 
This light was placed 12.5 inches above the table and 
directly above the last 18 inches of the trough. By 
keeping the different physical aspects of the material 
and the apparatus constant we hoped to reduce our test 
to one variable, that of discriminating between the 
“good” and the “bad” moving spheres. 

Visual skills were measured with the Bausch & Lomb 
Ortho-Rater. This instrument provides measures of 
12 visual skills which have been found on the basis of 
scientific investigation to be important to success on 
individual jobs. The 12 skills measured are: 1. Far 
Phoria Vertical; 2. Far Phoria Lateral; 3. Far Acuity 
Both Eyes; 4. Far Acuity Right Eye; 5. Far Acuity 
Left Eye; 6. Depth Perception; 7. Color Vision; 8. 
Near Acuity Both Eyes; 9. Near Acuity Right Eye; 
10. Near Acuity Left Eye; 11. Near Phoria Vertical; 
and 12. Near Phoria Lateral. 

First Experiment. Each subject was given the roll- 
ing sphere test from two views, one from the side of 
the trough (where he could see the spheres rolling from 
right to left in front of him) and one from above the 
end of the trough (where he could see the spheres 
rolling toward him). These two trials were given at 
different times, usually on succeeding days. 

About half of the subjects were given the “side view” 
first (hereafter referred to as “side view first”), and 
later were given the “end view” (hereafter referred to 
as “end view second”); the other half were given the 
end view (end view first) and later were given the side 
view (side view second). When each subject came for 
the first time, he was given the following verbal instruc-
tions about the experiment:


“In this experiment spheres will be rolled down the 
inclined trough. Some of the spheres are unmarked 
such as this (show example); these spheres we call 
‘good.’ Other spheres are marked such as this (show 
example); these spheres we call ‘bad.’ Please place 
your head against the head rest. I will now show you 
an example of a ‘good’ sphere rolling by (show example) 
and here is the ‘bad’ sphere (show example). As each 
sphere rolls by you on the trough, say ‘good’ if you 
think it is a good sphere, or ‘bad’ if you think it is a 
bad one. In case you are not able to discriminate 
between good or bad, say, ‘don’t know,’ but if you 
have any impression at all, say ‘good’ or ‘bad,’ depend- 
ing upon which you think it is. It is particularly 
important that you make a response for each sphere. 
The spheres will roll by rapidly but at any given time 
you will see only one. Do you have any questions? 
. . . Here comes the first sphere!”

The 20 spheres, previously arranged in random order
and placed in the return trough, were rolled by one of
the experimenters, one at a time for 8 sequences,
making a total of 160 spheres for each of the two views.
Each sphere was released at the top of the trough at
about the time the previous sphere was passing under
the screen and took approximately 2.5 seconds to roll
the entire length of the trough, but was visible to the
subject for only .6 of a second.

[Fig. 1: top view and side view panels; labels MARBLE and TROUGH.]

Forty-six members of a class in beginning psychology
at Purdue University were used in the first experiment.
Ortho-Rater test scores were available for these stu-
dents. Subjects were selected on the basis of their
visual profile so that they would be fairly homogeneous
with regard to their near visual acuity and with as wide
a range of phoria test scores as possible.

Second Experiment. The essential differences be-
tween the first and the second experiment were: 1. A
motor was added to bring up the spheres and drop
them in the trough automatically. The motor was
geared to drop a sphere in the trough every 2.5 seconds.
Each sphere took approximately 1 second to roll
down the trough, but was visible to the subject for only
about .4 second. 2. The subjects in this experi-
ment were not selected on the basis of their visual
scores but were a random sample of a class of approxi-
mately 150. 3. The number of spheres was held to 4
sequences of 20 spheres, or 80 spheres altogether, since
the first experiment had shown this to be a sufficient
number of trials for reliability.

The rest of the test procedure remained constant. 
Forty-five subjects from a beginning course in psy-
chology were used in this experiment.


Results 


The number of errors made by each subject
(an incorrect response or a response of “don’t
know” was considered an error) was com-
puted. These computations were made for
each view separately by orders (first or second).
In addition to the total number of errors made
by each individual there are also totals for
each of the sequences of 20 spheres viewed by
the subject.

[Fig. 1, continued: legend (3 RPM motor); scale 1" = 10". Caption: Top and side views of apparatus.]

Reliability. Reliability was computed sepa-
rately for the end and for the side view, but no
differentiation was made as to whether the
view was the first or the second. Reliability
was measured by the odd-even technique, 
taking as the “Odds” the odd numbered 
spheres of the odd numbered 20 sphere se- 
quences plus the even numbered spheres of 
the even numbered sequences, and as “Evens” 
the even numbered spheres of the odd se- 
quences plus the odd numbered spheres of the 
even sequences. Thus each sphere was in- 
cluded alternately in the “even” and “odd” 
for each view, since minor unintentional differ- 
ences in the etching of some of the marbles, or 
inherent flaws, might affect the appearance of 
individual spheres and therefore the scores of 
the subject. The resulting “odd-even” cor- 
rected correlation for the side view of the first 
experiment was r=.87. The corrected relia- 
bility of the end view was r=.95. The relia- 
bility for the side view was influenced signifi- 
cantly by one exceptional case. With this 










case omitted, the corrected reliability was
reduced to r=.64. The apparently lower
reliability of the side view is probably due, in
large measure, to the markedly lower variability
of side-view scores, which reflects the low number
of errors.
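The alternating odd-even split described above can be sketched as follows. The text says only that the correlation was "corrected"; the sketch assumes the usual Spearman-Brown step-up, 2r / (1 + r), which may not be the exact correction the authors applied. The data layout (`errors[sequence][sphere]` holding 1 for an error and 0 otherwise) and all function names are hypothetical.

```python
from math import sqrt


def split_half_scores(errors):
    """Return (odds, evens) error totals for one subject.

    errors[s][p] is 1 if the subject erred on sphere p of sequence s.
    "Odds" takes odd spheres of odd sequences plus even spheres of even
    sequences, i.e. positions where sequence number + sphere number is even.
    """
    odds = evens = 0
    for seq_no, sequence in enumerate(errors, start=1):
        for sphere_no, err in enumerate(sequence, start=1):
            if (seq_no + sphere_no) % 2 == 0:
                odds += err
            else:
                evens += err
    return odds, evens


def pearson_r(x, y):
    """Product-moment correlation; assumes neither list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r):
    """Step a half-test correlation up to full-test length (assumed correction)."""
    return 2 * r / (1 + r)


def odd_even_reliability(all_subjects):
    """Corrected odd-even reliability over a list of per-subject error grids."""
    halves = [split_half_scores(e) for e in all_subjects]
    r = pearson_r([h[0] for h in halves], [h[1] for h in halves])
    return spearman_brown(r)
```

Alternating the halves across sequences means each physical sphere contributes to both halves, which is the stated purpose of the scheme.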

The reliability as shown in the second ex- 
periment was measured in the same manner as 
in the first experiment. The corrected coeffi- 
cient for the end view, in the second experi- 
ment, was .89. The corrected reliability of 
the side view for the second experiment was .89. 

It will be noted that in the second experi-
ment the reliabilities of the side and the
end views are approximately the same. The
higher reliability from the side view in the 
second experiment can probably be explained 
by the fact that the variability of scores from 
the side view was larger due to increased diffi- 
culty of the test. 

Differences Between End and Side Views. 
The mean number of misses made by each 
group on the end and on the side was com- 
puted. The mean number of misses made by 
all the subjects in the second experiment when 
looking at the spheres from the end view was 
14.97. Their mean number of misses from 
the side view was 8.48. The difference be- 
tween these means was 6.49. This difference 
is 5.31 times its standard error. 
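The figures in this paragraph can be checked with a few lines of arithmetic. The standard error itself is not reported; the value below is only what the stated difference and ratio imply.

```python
# Values taken from the paragraph above.
mean_end_view = 14.97
mean_side_view = 8.48
critical_ratio = 5.31   # difference divided by its standard error

difference = mean_end_view - mean_side_view
# Not reported in the text: the standard error implied by the two figures.
implied_standard_error = difference / critical_ratio

print(f"difference between means: {difference:.2f}")
print(f"implied standard error:   {implied_standard_error:.2f}")
```

An implied standard error of about 1.22 misses is thus consistent with the reported ratio.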

This difference between means would indi- 
cate that the discrimination of moving objects 
is a much easier task when these objects are 
viewed from the side than when they are 
viewed from the end or coming toward the 
subject. 

The Pearsonian r was computed between
the score made on the end view and the score
made on the side view, separately for each
order of presentation. This breakdown re-
sulted in two groups. The first group con-
sisted of subjects who took the order side view
first, end view second. The second group 
consisted of those who took the order end 
view first, side view second. The resulting 
Pearson r’s were .07 and .31 respectively. 
In view of the size of these coefficients, it 
seems probable that different factors are in- 
volved when performing this task from the 
side and from the end. 

Correlation with Vision Tests. In order to
show possible relationship between the sphere 
test scores and visual skills as measured by 


the Ortho-Rater, correlation coefficients were 
computed. 

Both the Pearsonian r and an eta were com-
puted for all the acuities and the far depth
perception since it was not known in advance
whether the correlation would be linear or
curvilinear. In the case of the phoria scores
both lateral and vertical, far and near, previous
experience had shown that the relationship
between success on a task and scores on these
tests was curvilinear and therefore only etas
were computed.
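The correlation ratio (eta) mentioned above measures how much of the variance in one score is accounted for by class membership on the other, without assuming a straight-line relationship. The sketch below uses the standard definition, eta = sqrt(SS between / SS total); the grouping of error scores by vision-test class and the example numbers are invented, and the small-sample correction the authors took from Guilford is not reproduced.

```python
from math import sqrt


def correlation_ratio(groups):
    """Eta for scores grouped by class interval of the other variable.

    groups is a list of lists; each inner list holds the scores of the
    subjects falling in one class. Assumes the scores are not all equal.
    """
    scores = [y for group in groups for y in group]
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((y - grand_mean) ** 2 for y in scores)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
        if group
    )
    return sqrt(ss_between / ss_total)


# Invented, inverted-U data: middle class differs from both extremes.
example = [[1, 2, 3], [5, 6, 7], [1, 2, 3]]
print(round(correlation_ratio(example), 3))
```

With data shaped like this example, a product-moment r computed on the class order would be near zero while eta is large, which is why only etas were used for the phoria scores.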

The Pearsonian r’s varied from -.09 to
.48. The correlation ratios varied from .24
to .79. Since the correlation ratio is not a 
highly reliable measure where the number of 
cases is small the correction suggested by 
Guilford was applied. This correction re- 
sulted in coefficients varying from .00 to .73. 
The majority of the coefficients are positive. 
We must not forget, however, that the number 
of cases used in this experiment is relatively 
small and that any generalization based on a 
careful study of the correlation coefficients will 
not be highly valid. 

It would appear, however, that the visual 
abilities of the subject as measured by the 
Ortho-Rater tests are correlated in a low but 
positive fashion with ability in discriminating 
moving objects.


Summary 


A test of ability to discriminate fine details 
in moving objects was constructed. This test 
involved discriminating between marked and 
unmarked clear glass spheres as they were 
rolling down an incline. Two test conditions 
were investigated: 1. With the spheres rolling 
toward the subject; and 2. With the spheres 
rolling laterally past the subject. 

This technique was found to have a relia- 
bility of .89 to .95 when the spheres were rolling 
toward the subject and .89 when the spheres 
were rolling past the subject. 

It is felt that studies of this type might 
well reveal desirable changes in design of in- 
dustrial machines to take account more effec- 
tively of the demands made upon the machine 
operator. Biomechanics has been concerned 
with the abilities and skills required in the 
operation of machines which are already in- 
stalled and operating. It would appear that 







relatively simple laboratory studies of the
psychological and physiological implications
while the machine is still in the design stage
could make contributions resulting in equip-
ment which, when finally installed, would be
more efficient through an increase in the 
efficiency to be expected from the operator. 
The present study suggests the possibility of 
such an approach. 


Scores on the moving object test were cor- 
related with the scores in the visual skills tests 
measured by the Ortho-Rater. These correla- 
tions would indicate that there is a low but 
relatively consistent positive relationship be- 
tween ability to differentiate fine details in 
moving objects and visual skill test scores. 


Received April 25, 1949








An Analysis of Visual Requirements in Industry * 


E. J. McCormick 


Occupational Research Center, Purdue University 


The importance of certain visual skills to 
performance on industrial jobs has been em- 
phasized by the results reported from various 
investigations such as those conducted by 
Tiffin (6, 7, 8, 9, 10), Kephart (2, 3), and Stump
(4, 5). To a considerable extent these and
other investigations have been made with re-
spect to employees on individual jobs; they
have revealed in specific situations marked
relationships between visual skills and various
aspects of performance such as production,
turnover, and accident experience. The sev-
eral studies made on individual jobs have sug-
gested the possible desirability of investigating
the relationship between certain visual skills 
and performance on a variety of industrial jobs. 


Statement of the Problem 


The Occupational Research Center of the 
Division of Education and Applied Psychology, 
Purdue University, is engaged in the develop-
ment of vision-test profiles for various indus-
trial jobs. These profiles are used by indus-
trial and business organizations for the dual
purpose of aiding in employee selection and of 
identifying those individuals whose visual 
skills may not meet the general visual demands 
of their jobs and who might therefore benefit 
from professional eye care. 

In the past the visual profiles for jobs were 
individually adapted for specific jobs on the 
basis of the relationships for present employees 
between their visual skills as measured by the 
Ortho-Rater¹ and their criterion measurements.
Some such relationships might be due to chance 
factors, however, since the vagaries of human 


* This article is based on the author’s dissertation of 
the same title submitted to the Faculty of Purdue 
University in partial fulfillment of the requirements for 
the degree of Doctor of Philosophy, August, 1948. The 
dissertation was directed by Dr. N. C. Kephart. 

¹ The Ortho-Rater is a visual testing instrument
manufactured by the Bausch and Lomb Optical Com- 
pany, Rochester, N. Y., for the visual classification and 
placement of industrial employees. A description of 
these tests is available elsewhere (11). 




traits are of such a character that in almost 
any single situation it is likely, by chance 
alone, that some statistically significant rela- 
tionship might show up; if such a relationship 
appears to be a logical one, there is the temp- 
tation to accept it without cross-validation on 
a “hold-out” group. 

The present investigation is an extension 
of research on the relationship between certain 
visual skills and job performance, directed
toward providing a basis for establishing
visual-skill profiles for industrial use which
would avoid the potential pitfalls of individually
adapted profiles. In the light of this
general objective two specific objectives were 
established, namely: (1) The examination of 
the relationship between visual acuity and job 
performance of employees on a number of 
different jobs in order to determine any basic 
relationships that are of general applicability; 
and (2) The possible development of a different 
method for establishing visual acuity test cut- 
offs for visual profiles for certain types of 
industrial jobs. 

Basic Data 


The basic data used in the study were the 
results of Ortho-Rater vision tests and meas- 
ures of job performance on approximately 5500 
employees on 92 jobs in a number of industrial 
establishments. This information was obtained
from the extensive files of the Occupational
Research Center.


Selection of Hold-Out Jobs. A sample of 51 jobs that 
were in general randomly selected was drawn from files 
of the Occupational Research Center to serve as a 
“hold-out” group. This sample of “hold-out” jobs was 
drawn in order to be able to cross-validate on it the 
results of certain analyses (to be described later) that 
were made on a group of “base” jobs. Only jobs with
about 20 or more employees were included. 

The 51 jobs were then empirically divided into two 
groups on the basis of the judged character of the predominant
visual demands. The first group (Group 1)
included predominantly near vision jobs: those which
were considered to require close, and rather constant, 
visual attention within a relatively restricted range, 







usually within arm’s reach. Typical of the jobs in- 
cluded in this group are hosiery loopers, sewing-machine 
operators, assemblers of small parts, and certain types 
of visual inspectors. 

The other group (Group 2) included jobs which re- 
quired varying degrees of both near and far visual 
attention, and might therefore be considered as combi- 
nation near and far vision jobs. Such jobs typically 
require some relatively close visual attention, say within 
arm’s reach, as well as some farther visual attention at 
varying distances, either within or beyond the immediate
work environment. Illustrative of jobs included
in this group are knitting machine operators, weavers, 
spinners, and punch press operators. 

Selection of Base Jobs. In order to make certain 
types of analyses, a sample of similarly representative 
“base” jobs was selected. A careful empirical exami- 
nation was made of the relationship between visual 
acuity and the performance criteria of employees on 
these jobs. The only jobs retained were those on which 
it was empirically ascertained that there was a satis- 
factory degree of relationship between visual acuity and 
job performance. (These base jobs were so “selected” 
in order to ascertain, by subsequent analyses, the 
patterns of acuity skills which presumably contributed 
to performance on such jobs, with the thought that 
such “patterns” could later be cross-validated on the 
hold-out jobs.) A total of 41 such base jobs was re- 
tained. These base jobs likewise were divided into 
Group 1 (near vision jobs) and Group 2 (combination 
near and far vision jobs). 

Criteria of Job Performance. The criteria of job 
performance varied from plant to plant and from job 
to job, and were based on such factors as production 
data or earnings, or were the results of ratings, including
some ratings made by the paired comparison technique.
In most instances the criterion measurements had been
reduced to four or five numerical categories. The
criterion categories for each job were divided into two 
groups, usually of approximately the same size, which 
were then identified as “high criterion” employees and 
“low criterion” employees. In instances where there 
was a middle category which precluded an approxi- 
mately even division of the individuals into two groups, 
the middle criterion category was omitted. 

Vision Tests. Ortho-Rater vision-test scores were 
available on all employees of the “base” and “hold-out” 
jobs. The Ortho-Rater includes tests of the following 
visual skills: (1) Far tests (given at the optical equiva- 
lent of 26 feet): phoria, vertical; phoria, lateral; acuity, 
both eyes; acuity, right eye; acuity, left eye; depth 
perception; and color discrimination; and (2) Near tests 
(given at the optical equivalent of 13 inches): acuity, 
both eyes; acuity, right eye; acuity, left eye; phoria, 
vertical; and phoria, lateral. 

The six acuity tests were given particular attention 
in the present investigation. The results of the far 
acuity, both-eye test and of the near acuity, both-eye 
test were used directly. The results of the far acuity 
tests of the right and left eyes were compared for each 
individual and only the lower, or “worse-eye” test 
score, was used. A similar comparison was made of
the near acuity tests for the right and left eye. The 


four measures of visual acuity actually used were then: 
far acuity, both eyes; far acuity, worse eye; near acuity, 
both eyes; and near acuity, worse eye. 
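The reduction of the six acuity tests to four measures can be sketched in a few lines. This is only an illustration: the class name, the field names, and the example scores are hypothetical and are not taken from the Ortho-Rater manual or from the study's data.

```python
from dataclasses import dataclass


@dataclass
class AcuityScores:
    """Six Ortho-Rater acuity scores for one employee (hypothetical field names)."""
    far_both: int
    far_right: int
    far_left: int
    near_both: int
    near_right: int
    near_left: int


def four_measures(s: AcuityScores) -> dict:
    """Keep the both-eye scores as they are and take the lower of the
    right-eye and left-eye scores as the worse-eye score."""
    return {
        "far_both": s.far_both,
        "far_worse": min(s.far_right, s.far_left),
        "near_both": s.near_both,
        "near_worse": min(s.near_right, s.near_left),
    }


# Invented scores for a single employee, for illustration only.
example = AcuityScores(far_both=10, far_right=9, far_left=6,
                       near_both=11, near_right=8, near_left=10)
measures = four_measures(example)
```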

The range of possible scores on these tests is from 
zero to 15.²


Relationship Between Visual Acuity 
and Job Performance 


The over-all relationship between each of 
the four visual acuity tests and job performance 
is reflected by the proportion of individuals 
scoring at each score level who were high 
criterion employees. This basic relationship 
for each test is presented graphically in Figures 
1, 2, 3, and 4, which include the relationship 
for all of the 51 hold-out jobs combined, as well 
as for the Group 1 jobs and Group 2 jobs 
separately. 
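The quantity plotted in these figures may be illustrated by a minimal sketch. The function name and the records shown are hypothetical; only the definition (high criterion employees as a per cent of all employees at a given score) follows the text.

```python
from collections import defaultdict


def percent_high_by_score(records):
    """records: iterable of (score, is_high_criterion) pairs.
    Returns {score: per cent of employees at that score who are high criterion}."""
    totals = defaultdict(int)
    highs = defaultdict(int)
    for score, is_high in records:
        totals[score] += 1
        if is_high:
            highs[score] += 1
    return {s: 100.0 * highs[s] / totals[s] for s in sorted(totals)}


# Invented records: (acuity score, high criterion?)
records = [(5, False), (5, False), (5, True),
           (9, True), (9, False),
           (12, True), (12, True)]
curve = percent_high_by_score(records)
```

Computing the same curve separately for the Group 1 and Group 2 records would give the three lines drawn in each figure.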

Both the far and near acuity, both-eye 
graphs show (Figures 1 and 3), for all 51 jobs 
combined, a relatively marked increase in the 
proportion of high criterion employees at each 
successive score value in the lower score range, 
with a less marked, though still persistent, in- 
crease at each successive score level in the 
higher score range. When considering the 
Group 1 and Group 2 jobs separately, however, 
it will be observed for the far acuity, both-eye 
test, that the Group 1 jobs do not reflect the 
drop at the lower end of the test-score range 
that occurs for the Group 2 jobs. For the 
near acuity, both-eye test, both Group 1 and 
Group 2 have essentially the same pattern of 
relationship with job performance. 

For the two worse-eye tests (Figures 2 and 
4), the positive, relatively straight-line, rela- 
tionship from the low to high scores indicates 
that with higher worse-eye acuity (both near
and far) the probability of satisfactory performance
on jobs of the types studied increases
quite constantly. A slight up-swing for the
Group 1 jobs at the very low scores, especially 
for near acuity, worse eye, implies that for 
constant, close visual work a marked restriction 
of acuity is more conducive to job success 
than a relatively low level of acuity, presumably 
because the worse eye does not then hamper 
the effective use of the better eye as it might 
if the acuity level were only relatively low. 

² For a conversion of these scores to other common
measures of visual acuity the reader is referred to the
manual of standard practice (11).










[Figure 1. Vertical axis: per cent of high criterion employees. Horizontal axis: far acuity, both eyes (test score). Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.]


Fig. 1. Per cent of high criterion employees to total 
employees at successive scores on Far Acuity, Both- 
Eye Test. 


Aside from this difference between Group 1 
and Group 2, however, the curves for the two 
groups are basically the same. 


Experimental Vision Test Profiles 


Previous mention has been made of some 
of the potential disadvantages of developing 
visual standards for individual jobs on the 
basis of the relationship between the vision 
test scores and the performance criteria of 
employees on the jobs in question. To try 
to overcome these disadvantages, two different 
methods were developed for setting visual- 
acuity standards for jobs of the types investi- 
gated. These methods resulted in the estab- 
lishment of acuity-test cut-off scores which 
were incorporated in “‘profiles’’ for the jobs. 

Acuity Profile Derived from ‘Average’ 
Acuity Scores. Yor each of the 41 base jobs 
a visual-acuity profile was individually devel- 
oped which resulted in an adequate degree of 
differentiation between the high criterion and 
low criterion employees. For each of the four 
acuity tests, the average of the Ortho-Rater 
cut-off scores which were incorporated in these 
individually-developed profiles was determined. 


The whole scores nearest the averages for the 
respective tests were established as the cut-off 
scores for the first type of experimental profile. 
This profile (designated as profile A) included 
the following cut-off scores: far acuity, both 
eyes, 8; far acuity, worse eye, 6; near acuity, 
both eyes, 8; and near acuity, worse eye, 7. 
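The derivation of a fixed cut-off score of this kind may be sketched as follows. The list of individual cut-off scores is invented for illustration (the per-job values are not reported), and rounding a half upward is an assumption, since the text says only that the nearest whole score was taken.

```python
def nearest_whole(x: float) -> int:
    """Nearest whole score for a non-negative average; halves are rounded
    upward (an assumption, the rule for ties is not stated in the text)."""
    return int(x + 0.5)


def fixed_cutoff(individual_cutoffs):
    """Average the cut-off scores of the individually developed profiles
    for one test and take the nearest whole score."""
    mean = sum(individual_cutoffs) / len(individual_cutoffs)
    return nearest_whole(mean)


# Hypothetical cut-offs on one acuity test for seven base jobs.
hypothetical_cutoffs = [7, 8, 9, 8, 7, 9, 8]
cutoff = fixed_cutoff(hypothetical_cutoffs)
```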

Acuity Profiles Adjusted to the Visual Acuity 
Level of Employees on the Job. The second type 
of experimental acuity profile was designed to 
establish the cut-off scores on the four acuity 
tests on the basis of some standardized relation- 
ship to the general acuity level of the employees 
on the job in question. This “standardized 
relationship” was developed by an analysis of 
data on the base jobs, for subsequent applica- 
tion and cross-validation on the hold-out jobs. 
The specific procedures involved in the analy- 
sis of the 41 base jobs are given below. 


1. Procedures used in analyzing data on 
each of four acuity tests (Note: these are de- 
scribed in terms of a single test, near acuity, 
both eyes, but were similar for all four tests): 

a. For each of the 41 base jobs, a frequency 
distribution was made of the scores of the 
employees on the near acuity, both-eye test. 


[Figure 2. Vertical axis: per cent of high criterion employees. Horizontal axis: far acuity, worse eye (test score). Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.]


Fig. 2. Per cent of high criterion employees to total 
employees at successive scores on Far Acuity, Worse- 
Eye Test. 




b. For each of the 41 frequency distribu- 
tions certain percentiles were computed to the 
nearest whole-scores. The percentiles used 
were the 10th, 15th, 20th, 25th, and 30th, 
since a preliminary investigation suggested 
that the percentiles in this range would most 
adequately serve the purposes of subsequent 
analyses. 

c. For each job an individualized visual 
acuity profile had been previously developed 
(with cut-off scores on the four acuity tests) 
which was empirically judged to differentiate 
adequately between the high criterion and the 
low criterion employees. 

d. For each job the difference in score units 
was determined between the both-eye near 
acuity cut-off score of the previously developed 
profile and each of the specified percentiles on the 
test of the employees on the job. 

e. These differences in score units were then 
summarized for all 41 jobs, there being a 
separate summary of these differences for the 
10th, 15th, 20th, 25th, and 30th percentiles. 
Independent summaries of the same type were 
also made for the Group 1 jobs and for the 
Group 2 jobs. Varying degrees of “stability” 


[Figure 3. Vertical axis: per cent of high criterion employees. Horizontal axis: near acuity, both eyes (test score). Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.]


Fig. 3. Per cent of high criterion employees to total 
employees at successive scores on Near Acuity, Both- 
Eye Test. 


[Figure 4. Vertical axis: per cent of high criterion employees. Horizontal axis: near acuity, worse eye (test score). Curves: 16 near jobs; 35 near-far jobs; all 51 jobs.]


Fig. 4. Per cent of high criterion employees to total 
employees at successive scores on Near Acuity, Worse- 
Eye Test. 


of these differences were noted. For the 
Group 1 jobs, for example, it was observed 
that for 15 of 20 jobs the cut-off score (of the
individually developed profiles) was one score 
unit below the 25th percentile for the respective 
jobs. It was found that corresponding differ- 
ences for some percentiles were less stable. 

2. Procedures used in setting up formulas 
for experimental profiles of four acuity tests: 

a. Several experimental profile “formulas” 
were developed on the basis of the type of 
analysis outlined above. These profile for- 
mulas embodied for the four acuity tests the 
more “stable” of the relationships reflected in 
the differences between the cut-off scores of the 
individually adapted profiles of the base jobs 
and certain percentiles. Following is an illus- 
tration (for one test) of the way in which these 
relationships were converted into profile for- 
mulas for use in establishing cut-off scores on 
the various tests for individual jobs: using the 
example given in (e) above for the near acuity, 
both-eye test, the cut-off score on this test for 
a given job would be set at one score unit 
below the 25th percentile of the distribution of 
scores of the employees on this test. 
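A profile formula of the kind illustrated above (cut-off set one score unit below the 25th percentile) might be sketched as follows. The nearest-rank definition of a percentile is an assumption, since the text states only that percentiles were computed to the nearest whole score; the function names and the job's score distribution are hypothetical.

```python
import math


def percentile_score(scores, pct):
    """Whole-score percentile by the nearest-rank rule (an assumed definition):
    the lowest score at or below which at least pct per cent of employees fall."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def formula_cutoff(scores, pct=25, offset=-1):
    """Cut-off score for a job: the given percentile of the job's own
    score distribution, shifted by a fixed number of score units."""
    return percentile_score(scores, pct) + offset


# Invented near acuity, both-eye scores for the employees on one job.
job_scores = [4, 6, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12]
cutoff = formula_cutoff(job_scores)
```

Because the cut-off is tied to the job's own distribution, the same formula yields a higher cut-off on a job whose employees generally score higher.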










For the several experimental profiles of this
type the cut-off scores on the four acuity tests 
for a given job would be established in a 
similar manner; various combinations of such 
relationships were used in the different experi- 
mental profiles. Some profiles were developed 
for use on jobs of the type included in Group 1, 
and others for jobs of the type included in 
Group 2. These profiles were designated by 
letter, from B through K. 

b. Some of the experimental profiles of this 
type excluded certain tests, and some specified 
lower limits for cut-off scores (either in the 
form of specified percentiles or specific scores). 
These experimental variations were suggested 
on the basis of more detailed analysis of the 
data on the 41 base jobs. 

Profiles of Phoria, Depth, and Color Tests. 
In order to examine the comparative results of 
the various experimental acuity profiles by
themselves, and also when phoria, depth, and 
color tests were added to the battery, an aux- 
iliary profile which included a specified com- 
bination of cut-off scores on these tests was 
added to all of the acuity profiles. This aux- 
iliary profile (profile X) was developed in con- 
junction with a companion study (1), and 
consisted of the test cut-off scores which were 
most frequently included in the profiles of the 
base jobs as originally established in the regular 
procedures used by the Occupational Research 
Center.³ The combination of the X profile
with an acuity profile is indicated by the
addition of X to the alphabetical identification
of the acuity profile, as BX, or KX.


Application of Experimental Profiles 


The various experimental profiles, devel- 
oped from data on the base jobs, were applied 
to the hold-out job groups as indicated below 
for the purpose of cross-validation: Group 1 
jobs: profiles A, AX, B, BX, C, CX, D, DX, 
E, EX, F, FX, G, GX, H, HX; and Group
2 jobs: profiles A, AX, B, BX, I, IX, J, JX,
K, KX.

³ Profile X included the following cut-off scores:

Test                                               Score

Far vertical phoria, and near vertical phoria
    Low-end cut-off (left hyperphoria)
    High-end cut-off (right hyperphoria)
Far lateral phoria, and near lateral phoria
    Low-end cut-off (esophoria)
    High-end cut-off (exophoria)
Depth Perception
Color Discrimination

The “A” and “AX” profiles included fixed 
cut-off scores on the various tests. For the 
other profiles (B through K and BX through 
KX) the acuity-test cut-off scores for the
individual jobs were set by the formulas pre- 
viously described. 

For each profile applied to a given job the 
test scores of each employee were examined, 
and each individual who failed one or more of 
the tests (whose test score on a given test was 
at or below the cut-off score for that test) was 
considered as failing the profile. 
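The pass-fail rule may be illustrated with the fixed cut-off scores of profile A given earlier. The dictionary keys and the two employees' scores are hypothetical; the rule itself (a score at or below the cut-off on any one test fails the profile) follows the text.

```python
# Cut-off scores of profile A as reported in the text; key names are illustrative.
PROFILE_A = {"far_both": 8, "far_worse": 6, "near_both": 8, "near_worse": 7}


def passes_profile(scores, profile):
    """An employee passes only if every test score is above its cut-off;
    a score at or below the cut-off on any one test fails the profile."""
    return all(scores[test] > cutoff for test, cutoff in profile.items())


# Invented employees.
employee_1 = {"far_both": 9, "far_worse": 7, "near_both": 9, "near_worse": 8}
employee_2 = {"far_both": 12, "far_worse": 10, "near_both": 11, "near_worse": 7}

result_1 = passes_profile(employee_1, PROFILE_A)  # above every cut-off
result_2 = passes_profile(employee_2, PROFILE_A)  # at the cut-off on near_worse
```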

For each profile applied to each job, the 
numbers of high criterion employees and of low 
criterion employees who passed and who failed 
the profile were determined. The proportions 
of those passing and of those failing each 
profile were then determined, and the critical 
ratio of the difference between these propor- 
tions was computed. 
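The text does not give the formula used for the critical ratio of a single job. The sketch below assumes the familiar critical ratio of a difference between two independent proportions with a pooled standard error, and assumes that the proportions compared are the proportions of high criterion employees among those passing and among those failing the profile. Both assumptions, the function name, and the counts are illustrative only.

```python
import math


def critical_ratio(high_pass, low_pass, high_fail, low_fail):
    """Critical ratio of the difference between the proportion of high
    criterion employees among those passing and among those failing a
    profile, using a pooled standard error (an assumed formula)."""
    n_pass = high_pass + low_pass
    n_fail = high_fail + low_fail
    if n_pass == 0 or n_fail == 0:
        raise ValueError("both the passing and the failing group must be non-empty")
    p_pass = high_pass / n_pass
    p_fail = high_fail / n_fail
    pooled = (high_pass + high_fail) / (n_pass + n_fail)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_pass + 1 / n_fail))
    if se == 0:
        raise ValueError("all employees fall in one criterion group")
    return (p_pass - p_fail) / se


# Invented counts for one job: 50 employees pass the profile, 30 fail it.
cr = critical_ratio(high_pass=30, low_pass=20, high_fail=10, low_fail=20)
```

A positive value indicates that high criterion employees are relatively more frequent among those passing the profile than among those failing it.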

The subsequent step involved the deter- 
mination of the effectiveness of each profile 
on the entire group of jobs to which it was 
applied. For this purpose an over-all critical 
ratio was computed.⁴


Results of Experimental Profiles 


Table 1 presents the results of the several 
experimental profiles on the entire group of 
jobs to which they were individually applied. 
This table includes for each profile the over- 
all critical ratio previously mentioned. This 
over-all critical ratio for a given profile implies 
the probability that the differentiation be- 
tween the high criterion and low criterion 
employees could be due to chance factors. 
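Reading the footnoted formula as the mean of the individual critical ratios divided by 1/√N (that is, the mean multiplied by √N), the over-all critical ratio can be sketched as below. This reading is an interpretation, and the function names are hypothetical. As a check on it, the over-all ratios reported in Table 1 for profiles A and B on the 16 Group 1 jobs and the 35 Group 2 jobs are combined into a ratio for all 51 jobs, which can be compared with the tabled values of 7.24 and 8.39.

```python
import math


def overall_cr(job_crs):
    """Over-all critical ratio for N jobs: the mean of the individual
    critical ratios divided by 1/sqrt(N)."""
    n = len(job_crs)
    mean_cr = sum(job_crs) / n
    return mean_cr * math.sqrt(n)


def combine_groups(groups):
    """groups: list of (over-all CR, number of jobs) pairs. Recovers each
    group's sum of critical ratios (CR * sqrt(n)) and applies the same
    formula to the pooled jobs."""
    total_sum = sum(cr * math.sqrt(n) for cr, n in groups)
    total_n = sum(n for _, n in groups)
    return total_sum / math.sqrt(total_n)


# Table 1 values: (over-all CR, number of jobs) for Group 1 and Group 2.
combined_a = combine_groups([(2.46, 16), (7.07, 35)])  # compare with 7.24
combined_b = combine_groups([(2.03, 16), (8.75, 35)])  # compare with 8.39
```

Under this reading the combined figures agree with the tabled values to within the rounding of the two-decimal inputs.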

Acuity Profile. When examining the over- 
all critical ratios of the acuity profiles for 
the hold-out jobs it will be observed that 

⁴ The over-all critical ratio for each profile was computed
from the following formula applicable for replications
of similar experiments, as provided by Dr. I. W.
Burr, Department of Mathematics, Purdue University:

$$CR_o = \frac{M_{CR_i}}{\sigma_{M_{CR_i}}} = \frac{M_{CR_i}}{1/\sqrt{N}}$$

in which $CR_o$ stands for the critical ratio for all jobs
combined, $M_{CR_i}$ stands for the mean of the critical
ratios of the individual jobs, and $N$ stands for the
number of jobs.







Table 1

Summary of Results of Experimental Vision Test
Profiles on 51 Hold-out Jobs

Profile    Over-all CR    Profile    Over-all CR

Group 1 (16 near vision jobs)

A          2.46           AX         4.13
B          2.03           BX         4.23
C          2.35           CX         5.04
D          1.69           DX         4.08
E          2.86           EX         4.26
F          2.81           FX         4.08
G          1.97           GX         4.17
H          3.52           HX         5.16

Group 2 (35 combination near and far vision jobs)

A          7.07           AX         5.95
B          8.75           BX         6.21
I          7.31           IX         5.59
J          9.14           JX         6.31
K          8.10           KX         6.08

Groups 1 and 2 (51 jobs)

A          7.24           AX
B          8.39           BX

they range from 1.69 (5 per cent confidence 
level) to 9.14, which is far beyond chance 
expectations, with most of them being above 
the 2 per cent level (2.33). In terms of the 
over-all critical ratios, therefore, most of the 
acuity profiles may be considered as producing 
a differentiation that is beyond reasonable 
chance expectations. 

The various acuity profiles applied to the 
Group 1 jobs all produced essentially the same 
results on that group; similarly, the several 
profiles applied to the Group 2 jobs all gave 
approximately the same results on this group 
of jobs. In general, however, the results for 
the Group 2 jobs are more pronounced than 
for the Group 1 jobs. It was because of this 
difference in over-all results that more experi- 
mental profiles were developed and applied to 
the Group 1 jobs than to the Group 2 jobs. 

With regard to the results of the profiles on 
the Group 1 jobs, however, profile H (which 
excludes the far acuity tests and provides more 
rigid standards for the near acuity tests) resulted
in a somewhat greater degree of differentiation
than the other profiles. This implication
of the relatively restricted importance of
far acuity on such jobs (in comparison with 


near acuity) is in accord with the over-all rela- 
tionships previously discussed. 

The degree of differentiation between high 
criterion and low criterion employees that re- 
sulted from the experimental profiles might 
logically be considered as reflecting a minimum 
degree of the true relationship between the 
visual skills and job performance. Among the 
several possible factors that might contribute 
to this under-evaluation are the character of 
the criteria for some jobs, possible differences 
in illumination, differences in specific duties 
of various individuals included in the same job 
category, differences in age, and the possibility 
that certain jobs reflect a restricted range of 
talent due to the possible elimination after 
placement of some of the visually unfit. 

It is also pointed out that in the case of 
five of the Group 1 jobs and five of the Group 
2 jobs it was ascertained empirically that not 
even an individually adapted job profile would 
result in a reasonable degree of differentiation,
perhaps due to inadequate criteria or chance
factors. For such jobs it therefore could not
be expected that a standardized profile or
profile formula would result in a statistically 
significant differentiation. 

Profiles Including Acuity, Phoria, Depth, 
and Color Tests. In addition to the application 
of the various acuity profiles by themselves, 
the acuity profiles were also individually ap- 
plied with the addition of the auxiliary X 
profile, which includes a set of fixed scores for 
the phoria, depth, and color tests. The addi- 
tion of the X profile to the acuity profiles 
applied to the Group 1 hold-out jobs consis- 
tently resulted in additional discrimination. 

With the Group 2 jobs, however, the addi- 
tion of the X profile to the various acuity 
profiles resulted in a moderate decrease from 
the results obtained from the acuity profiles by 
themselves, although even then the over-all 
critical ratios were from 5.59 to 6.31. While 
this actual observed decrease was traced pri- 
marily to the systematic influence of a negative 
relationship for a very few specific jobs, there 
does not appear to be any evidence that the 
addition of the phoria, depth, and color tests 
adds particularly to the differentiation on 
Group 2 jobs over that obtainable with the 
acuity tests by themselves. Presumably the 
acuity tests tapped a good share of the meas- 










urable relationships between visual skills and 
performance on the jobs in this group. 

Since the phoria, depth, and color tests 
apparently contribute to the differentiation 
of Group 1 jobs over that produced by the 
acuity profiles, one inference that might logi- 
cally be made is that in the case of jobs which 
require relatively constant and close visual 
attention, the adequate adjustment of con- 
vergence as measured by the phoria tests, and 
possibly the ability to perceive depth relation- 
ships, presumably are related to satisfactory 
performance on such jobs. 


Implications 


The results of the investigation suggest 
certain factors which might be considered in 
the establishment of visual standards for jobs 
of the type studied. 

For example, the basic relationships be- 
tween the four visual acuity tests and job 
performance reflected in Figures 1, 2, 3, and 4 
imply that there should be a moderate lower 
limit for near acuity in both eyes for both near 
vision jobs (Group 1) and for combination 
near and far vision jobs (Group 2), and also a
moderate lower limit for far acuity for com- 
bination near and far vision jobs (Group 2). 
Aside from these moderate minima, however, 
it appears that the visual requirements for 
jobs of the types studied are relative rather 
than absolute; it would seem then that various 
specific visual standards which require rela- 
tively adequate degrees of acuity may be ex- 
pected to produce, in the over-all, relatively 
satisfactory results in differentiating between 
high criterion and low criterion employees on 
jobs of the types investigated. 

In addition, it would seem that for most 
effective results the visual acuity standards for 
a given job should in some way be adjusted to 
the visual acuity level of employees on the job; 
this implication is derived from the fact that 
the experimental profiles that were so adjusted 
tended to produce a somewhat greater degree of 
differentiation between the high criterion and 
low criterion employees than did the fixed- 
score acuity profile (profile A) which provided 
the same set of cut-off scores for all jobs. 

With regard to the possible use of standard- 
ized profile formulas for setting visual acuity 


job-standards, it should be pointed out that 
while no such formula can be expected to 
produce the degree of differentiation among 
present employees that is possible by developing 
individually adapted profiles for specific jobs, 
nevertheless, the use of standardized profile 
formulas seems to offer a more adequate basis, 
in the long run, for predicting job performance 
on most jobs of the types investigated and for 
the referral of employees for professional 
eye care.


Summary and Conclusions


An investigation was made of the relation- 
ship between certain visual skills and perfor- 
mance on various types of industrial jobs, using 
Ortho-Rater scores of approximately 5500 em- 
ployees on 41 “base” jobs and 51 “hold-out” 
jobs, and measures of job performance which 
identified high criterion employees and low 
criterion employees. The investigation was 
made with particular attention to four meas- 
ures of visual acuity, namely, far and near 
acuity in both eyes, and far and near acuity 
in the worse eye. The investigation included 
two phases: first, an analysis for each of the 
four tests of the proportion of all individuals 
on the hold-out jobs who scored at each score 
level who were high criterion employees; and 
second, the development and application of 
several experimental vision test profiles and an 
analysis of their results.

The primary conclusions are as follows: 


1. Visual acuity requirements for satis- 
factory performance on jobs of the types 
studied seem to be general and relative rather 
than specific and absolute. 

2. A moderate minimum of near visual 
acuity in both eyes is especially pertinent to 
satisfactory performance for both “near” and
“combination near-far” jobs, and a moderate
minimum of far visual acuity is equally per- 
tinent to performance on “combination near- 
far” jobs; beyond such minima, performance 
increases quite constantly with greater degrees 
of acuity, but at a more gradual rate. 

3. The increase in the probability of job 
success with increases in worse-eye acuity (both 
near and far) is relatively constant over the 
entire range of scores. 

4. The several experimental acuity profiles 
(developed from data derived from the base 







jobs and applied to the hold-out jobs) gave 
substantially similar results on the group of 
jobs to which they were individually applied 
(one group of near vision jobs and another 
group of combination near and far vision jobs). 

5. The results of practically all the profiles 
were beyond reasonable chance expectations, 
although the results were more pronounced on 
the combination near and far jobs than on 
near jobs. 

6. The addition to each of the acuity profiles 
of a standard set of fixed scores on phoria, 
depth, and color tests contributed noticeably 
to the differentiation between high criterion 
and low criterion employees on the near jobs, 
but not on the combination near and far jobs. 

7. The possible use of standardized profiles 
shows definite promise of providing, in the 
long run, relatively satisfactory standards of 
visual skills for jobs of the types investigated.


Received May 6, 1949.


References 


1. Carr, E. R. An analysis of the relationship of
phoria, depth perception and color discrimination
to job performance. Master's thesis, Purdue
Univ., 1948.

2. Kephart, N. C. Visual skills and labor turnover.
J. appl. Psychol., 1948, 32, 51-55.

3. Kephart, N. C. An analysis of professional eye
care and industrial efficiency. Trans. Amer.
Acad. Ophthal. & Otol., March-April, 1946,
166-178.

4. Stump, N. F. Spotting accident-prone workers by
vision tests. Fact. Mgmt. & Maint., June, 1945.

5. Stump, N. F. Visual functions as related to
accident-proneness. Personnel, 1944, 21, 50-56.

6. Tiffin, J. Industrial psychology. (2nd Ed.) New
York: Prentice-Hall, Inc., 1947.

7. Tiffin, J. Vision and industrial production. Illum.
Engng., 1945, 40, 230-257.

8. Tiffin, J. The use of visual data as an aid to
increase production and efficiency. Trans. Amer.
Acad. Ophthal. & Otol., January-February, 1944.

9. Tiffin, J., and Greenly, R. J. Employee selection
tests for electrical fixture assemblers and radio
assemblers. J. appl. Psychol., 1939, 23,
240-263.

10. Tiffin, J., and Rogers, H. B. The selection and
training of inspectors. Personnel, 1941, 18,
14-31.

11. Standard practice in the administration of the
Bausch and Lomb occupational vision tests with
the Ortho-Rater. Bausch and Lomb Optical
Company, Rochester, N. Y., 1944.








The Effect of Ordinal Position upon Responses to 
Items in a Check List 


Donald T. Campbell and Phillip J. Mohr 
Departments of Psychology and Speech, The Ohio State University 


With the increased use of questionnaires 
and check lists has come a healthy awareness 
of the many possible sources of bias in such 
instruments. One source of bias, which this 
paper considers, is position effect. Two types 
of position effect may be noted: (1) the effect 
of ordinal position, or position per se; and (2) 
the effect of specific sequences, involving content
interaction. The effect of ordinal position
has been mentioned in standard works on
the questionnaire method (e.g. 2) and this 
possibility has been regarded by some pro- 
fessional survey workers to be of sufficient 
importance to necessitate systematic controls. 

This anxiety is justified by surprisingly little 
published research. In factual tests, where 
responses presumably are based on knowledge, 
studies have shown significant ordinal differ- 
ences for the alternatives of a multiple choice 
item, although with somewhat inconsistent results
(1, 9, 11). Mathews has similarly studied
the effect of different left-right arrangements of
response alternatives for the items of an interest
test (12). In opinion poll studies fragmentary
reports indicate possible position
effects (4; 5, pp. 34-5; 7). Of these, Cantril (5)
finds significant but inconsistent differences 
for various orderings of two or three alterna- 
tives to a single poll question. From the 
available abstracts of the two other studies, 
it would appear that specific sequence effects 
(in which content interaction might be in- 
volved) are confounded with purely ordinal 
effects (such as might be due to primacy, 
recency, fatigue, or the like). No studies have 
been located in which the effect of position 
per se has been studied systematically. 

Our study attempts to isolate the effect of 
ordinal position in a preference check list 
already in use in professional work. We 
borrow from the annual Iowa Radio Audience 
Survey. The director, Dr. F. L. Whan, has 
been one of those who have been worried by the
possibility of position effect, and has attempted
to control it by using in rotation multiple forms
of a radio program types list. With Dr. 
Whan’s permission, we have taken with slight 
modification the check list of 16 radio program 
types used in the 1946 survey. Modifications 
in procedure have been introduced only insofar 
as necessitated by group administration in 
college classes. 

The check list, as we used it, had the follow- 
ing instructions: 


“Listed below are sixteen general types of 
radio program materials. In the spaces pro-
vided at the right, check the FIVE types
that you like BEST. Please check ONLY 
FIVE—no more, no less.” 


Half of the questionnaires listed the 16 
types with examples provided for each type. 
The examples, as in Dr. Whan’s studies, repre- 
sented well-known popular programs with com- 
paratively high “Hooper” ratings. The other 
half of the questionnaires gave no examples.
We wanted to determine whether or not rela- 
tive ambiguity, i.e., absence of examples, would 
accentuate any position effect, on the assump- 
tion that if the stimuli (the program types) 
were less well defined respondents might be 
more apt to choose with reference to position 
of an item on the check list rather than to its
content. The 16 program types are listed 
below, with the asterisk indicating the termina-
tion of the items in the “without examples”
series:


1. Livestock and Grain Market Reports*
2. Talks on Farming and Farm Problems*
3. News Broadcasts*, including local news, network
   commentators and farm news
4. Talks and Discussions of Important Public
   Affairs*, such as talks by congressmen, American
   Town Meeting of the Air, etc.
5. Devotional Programs*: Sermons, talks on religion, etc.
6. Hymns, Religious Music*, such as Hymns of All
   Churches, Salt Lake City Tabernacle Choir, etc.
7. Classical or Semi-Classical Music*, such as
   Metropolitan Opera, New York Philharmonic
   Orchestra, Andre Kostelanetz, etc.
8. Old-Time Pioneer or Western Music*, such as
   National Barn Dance, etc.
9. Popular Music and Popular Orchestras*, such as
   Fred Waring, Lucky Strike Hit Parade, Vaughn
   Monroe, etc.
10. Brass Bands*, such as the U.S. Army, Navy, or
    Marine Bands, etc.
11. Comedians*, such as Jack Benny, Bob Hope,
    Fred Allen, etc.
12. Variety Programs*, without featured comedians,
    such as Breakfast Club, Arthur Godfrey's Talent
    Scouts, etc.
13. Quiz or Audience Participation Programs*, such
    as Break the Bank, Quiz Kids, Vox Pop, Truth
    or Consequences, etc.
14. Complete Dramatic Shows*, such as Lux Radio
    Theatre, Screen Guild, etc.
15. Daily Continued Story Serials*, such as Ma
    Perkins, When a Girl Marries, Lum and Abner,
    etc.
16. Broadcasts of Sports Events*, such as broadcasts
    of football, basketball, baseball games, fights, etc.


Experimental Design. A “latin square”
provides the ideal design for such a study (8),
but to our knowledge it has not previously
been used for this purpose. Through this
device, we prepared sixteen check list forms,
with each item appearing once on each form,
and once in each ordinal position. Subject to
these restrictions, the distribution of items in
the latin square might well have been random.
But Fisher (6, pp. 267-269) points out that
“in a well-planned experiment certain
restrictions may be imposed on the random
arrangement of the plots in such a way that the
experimental error may still be accurately
estimated.” As mentioned before, this study was
designed to measure the effect of ordinal
position, apart from sequence or content
interaction effect. Therefore, rather than using
a completely random latin square, we used one in
which the sequence was systematically
controlled by having each program type preceded
once and followed once by every other type.¹

Fig. 1. The latin square design (letters designate questionnaire forms).

The latin square is shown in Figure 1. The 
letters A through P identify the sixteen basic 
questionnaire forms. For example, in form A, 
Classical Music was first, Dramatic Shows
second, Public Affairs third, etc. Because of
the use of the “with examples” series and the
“without examples” series, a total of 32 differ-
ent forms was actually used.
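
The two balance conditions described above (every item once in every ordinal position, and every item immediately followed once by every other item) can be illustrated with a short sketch. It uses a standard cyclic construction for even orders, often called a Williams design; this is an assumption made for illustration, and nothing in the text says the square supplied by the Statistics Laboratory was built this way. The function names are hypothetical, and items are numbered 0 to 15 rather than named.

```python
def row_complete_latin_square(n):
    """Return an n x n square: rows are forms, columns are ordinal positions.

    Assumed construction (cyclic, even n only); not taken from the paper.
    """
    if n % 2 != 0:
        raise ValueError("this cyclic construction needs an even n")
    # The first row zig-zags 0, 1, n-1, 2, n-2, ... so that the successive
    # differences between neighbours (mod n) are all distinct.
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Every other row is the first row with a constant added (mod n).
    return [[(item + shift) % n for item in first] for shift in range(n)]


def check_design(square):
    """Check the row, column, and adjacent-pair balance of a square."""
    n = len(square)
    items = set(range(n))
    rows_ok = all(set(row) == items for row in square)
    cols_ok = all({row[c] for row in square} == items for c in range(n))
    # Ordered pairs (item, item that immediately follows it) over all forms.
    pairs = [(row[c], row[c + 1]) for row in square for c in range(n - 1)]
    pairs_ok = len(set(pairs)) == len(pairs) == n * (n - 1)
    return rows_ok, cols_ok, pairs_ok


square = row_complete_latin_square(16)
print(check_design(square))
```

Because each ordered pair of distinct items occurs exactly once as immediate neighbours, any effect of what an item follows is spread evenly over the forms, which is the property the design relies on to separate ordinal position from sequence.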

¹ The writers are indebted to Dr. D. R. Whitney and
the Statistics Laboratory of the Department of
Mathematics, The Ohio State University, for providing this
“homogeneous” latin square and for doing the bulk of
the calculations. In passing, it may be noted that
squares of this sort are available only for even numbers
(3). The Statistics Laboratory will be happy to
provide such squares to those interested.







The Sample. The questionnaire was sub- 
mitted to 1280 students in 35 beginning classes 
in Psychology and Speech. Forty students 
filled in each form. Eight hundred and eighty-
four of the respondents were men, 374 women,
with 22 neglecting to indicate sex. 

Administration of the Survey. No advance 
warning was given to the students. Each in-
structor passed out the questionnaires, with
these instructions: “I am sure this is self-
explanatory. Do not collaborate with your
neighbors. Do not ask about or mention 
aloud any program or specific radio program, 
since that might influence the response of 
others.” We are reasonably sure that the 
students felt that this was simply a survey of 
radio program preferences of college students. 

The 32 forms were distributed randomly in 
all classes, each respondent’s ballot being dif- 
ferent in arrangement of choices from those of 
his neighbors. 

Results 


The Analysis of Results. The results are 
presented in Figure 2, combining cases with 
and without examples. They are presented 
graphically in Figure 3, keeping these two
sets of data separate. In Figure 2, the fre-
quency of the choice of each program type is
given for each of the 16 positions. For ex- 
ample, 40 respondents selected “Broadcasts of 
Sports Events” (No. 16) when that item ap-
peared in the 1st position; 37, when it appeared
in the 2nd position, etc. The maximum 
possible for any program type in any given 
position is 80. The totals at the bottom indi- 
cate the frequency of choice of the program 
types; the totals at the right, the frequency 
of choice of the ordinal positions. Note that 
the frequency of choice of each program type 
in each position is summed in the marginal 
totals at the bottom; in the totals at the right,
the frequency of choice of each position for 
each program type is summed. In short, each 
position and each program type had equal 
opportunity to be chosen. Note also that all 
1280 respondents had the opportunity of being 
represented in each marginal total—for posi- 
tion and for program types—thus equating 
samples for all comparisons. 

As might be expected, significant differences
occur among the preferences for the 16 program
types. “Popular Music and Popular Orchestras”
was most preferred; “Livestock and
Grain Market Reports” (Farm Markets) was
least preferred. But to the surprise of the
experimenters no apparent position effect is
found.

Fig. 2. Choices obtained (with and without examples combined). Rows are ordinal positions and columns are program types, with marginal totals at the right and at the bottom.

The graphic presentation of the material 
in Figure 3 supports this. This chart super- 
imposes the data for program preference and 
the data for position effect, to present visually 
the contrast in variability provided by the two 
dimensions of the data. In Figure 3 there is 
a separate presentation of data for “with ex-
amples” and “without examples,” which data
were presented combined in Figure 2. Note
the lack of any greater position effect in the 
“without examples” series. Both lines seem 
to vary at random about the expected value 
of 31.2% (400/1280). The presence of ex- 
amples does seem to have had an effect on 
the popularity of certain programs. 

Fig. 3. Percentage of choice by position and by program type. Separate curves are drawn for position and for program type, each with examples and without examples.

Statistical tests of significance confirm in
general the absence of position effect. The
marginal totals for the ordinal positions in
Figure 2 were tested collectively by Chi Square
against the expected value of 400 which would
obtain if position had no effect. The obtained
Chi Square of 12.4 had a P-value of .68, indi-
cating that values as large or larger than 12.4
could occur by chance 68 times out of 100. The
hypothesis that differences among the totals
are but chance deviations from the expected
value cannot be rejected by this test. Similar
tests were made of the position effect for each
program type separately (each column in
Figure 2) and none of these Chi Squares reached
a 5% level of significance. If any position
effect is present, it is not sufficiently strong to
manifest itself with this number of cases, and
as tested by Chi Square.²
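
The test on the position totals can be sketched as follows. The expected value of 400 follows from the design: 1280 respondents each made 5 checks, giving 6400 checks spread over 16 positions. The sketch below is a minimal illustration under stated assumptions: the totals in it are invented for the example and are not the values of Figure 2, the function names are hypothetical, and the upper-tail probability uses a standard closed form that holds only for an odd number of degrees of freedom (here 16 - 1 = 15).

```python
from math import erfc, exp, pi, sqrt


def chi_square_uniform(totals):
    """Chi-square of observed totals against equal expected totals."""
    expected = sum(totals) / len(totals)
    stat = sum((t - expected) ** 2 / expected for t in totals)
    return stat, len(totals) - 1


def chi_square_sf_odd_df(x, df):
    """Upper-tail probability of chi-square; closed form for odd df only."""
    if df % 2 != 1:
        raise ValueError("closed form used here requires odd df")
    if x <= 0:
        return 1.0
    total = erfc(sqrt(x / 2))               # the df = 1 part
    term = sqrt(2 * x / pi) * exp(-x / 2)   # first series term
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)             # next term of the series
    return total


# Hypothetical position totals (invented; they sum to 6400 as the design requires).
totals = [412, 421, 405, 378, 399, 408, 391, 402,
          395, 417, 400, 383, 397, 404, 394, 394]
stat, df = chi_square_uniform(totals)
print(round(stat, 2), df, round(chi_square_sf_odd_df(stat, df), 3))
```

A large P-value from such a test means the observed spread of the totals around 400 is no greater than chance alone would often produce.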

² Our use of a Fisher-type design in collecting the data
may lead to a question as to why analysis of variance
statistics have not been applied. In passing it may be
noted that latin squares and simultaneous use of several
classification criteria do not lend themselves exclusively
to analysis of variance. The present data, being
enumeration data, do not meet the basic assumptions of
analysis of variance. While an arc-sine transformation
of the data might have partially overcome this
weakness, in the present data there is still another limitation.
Because of the nature of the assignment to the
respondents (to pick 5 of 16 alternatives), totals on certain
classification criteria are fixed at equality in advance
(e.g., comparisons involving with and without examples,
sex, and questionnaire forms). The Statistics
Laboratory did exploratory analyses of variance on total and
sub-populations, with program type and position as
classification criteria. In no instance did position reach
the 5% level of significance.

Our original expectations of a position
effect involved, of course, some notion of trend.
More specifically, we had expected the first
positions to receive more choices than other
positions, with a downward trend from first to
last positions. Visual inspection of Figures 2
and 3 offers little support for this notion.
Note that, for the combined data, position 2
is highest and position 4 is lowest; position 10
one of the highest, and position 12 one of the
lowest. However, a more complete answer to
our initial notion of trend was obtained from an
application of Mann's “T” test. This
non-parametric statistic of trend seemed especially
applicable for our purposes in that it measures
upward or downward trends in data that are
enumerative in character, without the
requirement of equality of intervals between steps,
and making no assumptions of normality. The
application of this test to our data indicates
that, at a very marginal level of significance
(P = .048), the totals at the right in Figure 2
tend to decrease with an increase in ordinal
position number. This level of significance
certainly is not very high, particularly when
the relatively large number of cases is
considered. And, as noted above, frequent
exceptions occur to this trend, if it is a trend.

A further source of evidence on the presence
or absence of a position effect is found in the
comparison of the position totals from different
samples. If position were to have a highly
complicated and non-linear effect, it should
none the less be consistent from sample to
sample. With this in mind, the correlation
coefficients among position totals for four
sub-samples were computed, as shown below.
“(A)” represents the series of questionnaires
which provided examples for each program
type; “(B)”, the series which gave no examples:


Product Moment Correlation of:


… (A) with Men (B) is .21
… (A) with Women (B) is −.05
… (A) with Women (B) is .14
… (A) with Men (B) is .26
Men (A) with Women (A) is −.05
Men (B) with Women (B) is .13


These correlations describe little, if any, 
similarity in the obtained position totals from 
one sample to another. Note, however, that 
the general tendency is for the values to be 
positive, with two exceptions. Once again, 
there is no clear evidence for a position effect, 
but only some slight indications that one may 
be present. 
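
Both follow-up checks described above, the trend test on the sixteen position totals and the correlation of position totals between sub-samples, can be sketched in a few lines. This is an illustration under assumptions: the trend statistic counts concordant minus discordant pairs in the manner of Mann's test, but the P-value here comes from a normal approximation with a continuity correction and no adjustment for ties, and the text does not say whether the authors used that approximation or exact tables. The function names are hypothetical.

```python
from math import erfc, sqrt


def mann_trend(values):
    """Pair-counting trend statistic S with a normal-approximation P-value.

    Returns (S, z, p_downward); p_downward is small when later values
    tend to be lower than earlier ones.
    """
    n = len(values)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)    # +1 for a rise, -1 for a fall
    variance = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(variance)        # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(variance)
    else:
        z = 0.0
    p_downward = 0.5 * erfc(-z / sqrt(2))   # P(Z <= z)
    return s, z, p_downward


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In use, `mann_trend` would receive the sixteen position totals in order of ordinal position, and `pearson_r` would receive the sixteen position totals of two sub-samples, for example men and women within one series of forms.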

The present data seem to indicate little if 
any effect of ordinal position upon the number 
of choices received by the items in the check 
list. It is to be emphasized that the conditions 
under which these results have been obtained 
are highly unique and specific. We would 
suggest no generalizations of these results, ob- 
tained on college students, to the legendary 
housewife whose soup is boiling over on the 
stove during the interview. Similarly, our re- 
sults may be specific to the longish 16 item 
check list, or to the strict assignment of five 
choices, or the content of the items. Further 
research is obviously needed. Each commer- 
cial user of the check list may want to deter- 
mine the presence or absence of a position 
effect under his own operating conditions. We 
do believe, however, that through the use of 
designs such as the one presented in this study, 
the definitive answer for the specific situation 
can be obtained with the relatively small effort 
of preparing systematically multiple forms of 
the check list. 


Received April 18, 1949.


References 


1. Atwell, E. R., and Wells, F. L. Wide range multiple
   choice vocabulary test. J. appl. Psychol.,
   1937, 21, 550-555.

2. Blankenship, A. B. Consumer and opinion research.
   New York: Harper and Brothers, 1943.

3. Bugelski, B. R. A note on Grant’s discussion of
   the latin square principle in the design and
   analysis of psychological experiments. Psychol.
   Bull., 1949, 46, 49-50.

4. Cahalan, D., and Tamulonis, V. M. The effect of
   question variations in public opinion surveys.
   Amer. Psychol., 1947, 2, 328 (Abstract).

5. Cantril, Hadley. Gauging public opinion.
   Princeton: Princeton University Press, 1944, 34-35.

6. Fisher, R. A. Statistical methods for research
   workers. New York: Hafner Publishing Co., 1948.

7. Flowerman, S. Polls on anti-semitism: an
   experiment in validity. Amer. Psychol., 1947, 2, 328
   (Abstract).

8. Grant, D. A. The latin square principle in the
   design and analysis of psychological experiments.
   Psychol. Bull., 1948, 45, 427-442.

9. McNamara, W. J., and Weitzman, E. The effect
   of choice placement on the difficulty of
   multiple-choice questions. J. educ. Psychol., 1945, 36,
   103-113.

10. Mann, Henry B. Non-parametric tests against
    trend. Econometrica, 1945, 13, 245-259.

11. Mathews, C. O. The effect of position of children’s
    answers to questions in two-response types of
    tests. J. educ. Psychol., 1927, 18, 445-457.

12. Mathews, C. O. The effect of the order of printed
    response words on an interest questionnaire.
    J. educ. Psychol., 1929, 20, 128-134.








Identification of Cola Beverages. IV. Postscript


N. H. Pronko and D. T. Herman 
University of Wichita 


Three earlier studies¹ on the identification
of cola beverages have led to the present in- 
vestigation. The first study employed Coca 
Cola, Pepsi Cola, RC Cola, and Vess Cola. 
Administered to 108 Ss, results showed an 
almost equal distribution of identification re- 
sponses among the first three categories. 
Since Ss refused to use the fourth name, it was
decided to use only the first three drinks in a 
second study on the hypothesis that identifi- 
cation responses would be more nearly equally 
distributed among the three beverage names. 
The hypothesis was substantiated as was also 
a subsequent one in Study No. 3 which em- 
ployed three unknown or less well-known colas 
but which showed an almost chance distri- 
bution of the Coca Cola, Pepsi Cola and RC 
Cola names to the three ‘“‘dark horses.” On 
the assumption that subjects might do better 
if they were told to identify Coca Cola, Pepsi 
Cola and RC Cola when these beverages were 
actually administered and when the subjects 
were told that these were the Colas they were 
to identify, the present study was undertaken. 


Procedure 


As in the previous studies, two groups of 
subjects were used—105 Ss in Part I and 60 in 
Part II. These were beginning students in
Elementary Psychology courses.

Part I. Each of 105 Ss was admitted
individually into the experimental room and 
was asked to sit down, after which the following 
instructions were read to him: 


“We would like to have you taste and identify some
cola drinks. We will have three colas: Pepsi Cola,
Coca Cola and RC Cola. You will be told in what
order and when you are to drink them. After you
have finished each sample report your identification
to me. After each stimulus presentation, take
enough water from the paper cup before you to
rinse your mouth well.”


¹ Pronko, N. H., and Bowles, J. W., Jr. Identification
of Cola beverages. I. First study. J. appl. Psychol.,
1948, 32, 304-312; II. A further study. J. appl.
Psychol., 1948, 32, 559-564; III. A final study. J. appl.
Psychol., 1949, 33, 605-608.


A tray containing three one-oz. glasses of 
Coca Cola, Pepsi Cola and RC Cola respec-
tively was placed before the S. He was then
instructed in what order to drink the beverages 
labelled X, Y and Z. Samplings were spaced 
about a minute apart during which interval 
pertinent data were recorded. 

The order in which the three beverages were 
presented was determined pre-experimentally 
and was such that each beverage was admin- 
istered in the first, second and third positions 
an equal number of times (i.e., 35 times).
This counterbalanced order was intended to 
nullify position effects or taste interactions 
orally. All beverages were kept out of sight of 
Ss and were refrigerated at approximately 5° C. 
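
The counterbalancing described above can be sketched as follows. The text states only that each beverage occupied each of the three positions 35 times among the 105 Ss; the use of three cyclic orders below is an assumption made for illustration, and the function name is hypothetical.

```python
from collections import Counter

BEVERAGES = ["Coca Cola", "Pepsi Cola", "RC Cola"]


def counterbalanced_orders(items, n_subjects):
    """Assign each subject one of the cyclic rotations of `items` in turn."""
    k = len(items)
    if n_subjects % k != 0:
        raise ValueError("number of subjects must be a multiple of the item count")
    rotations = [items[shift:] + items[:shift] for shift in range(k)]
    return [rotations[i % k] for i in range(n_subjects)]


orders = counterbalanced_orders(BEVERAGES, 105)
# Count how often each beverage falls in each serial position (1, 2, 3).
position_counts = Counter(
    (position, drink)
    for order in orders
    for position, drink in enumerate(order, start=1)
)
print(position_counts)
```

Cyclic rotation balances serial position only: each beverage is always preceded by the same other beverage, so it does not balance sequence in the way a square with every ordered pair represented would.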

Part II. In Part II, 60 Ss were admin-
istered the same Cola drink at each of three
trials. Thus, 20 got all Coca Cola; 20, all 
Pepsi Cola; and 20, RC Cola. In all other 
respects, including the instructions, the pro- 
cedure was the same as that for Part I. 


Results 


Results show that of 105 Coca Cola samples, 
57 are correctly identified but 48 are mis-
identified. In the case of the 105 Pepsi Cola
samples 45 are correctly labelled and 60 mis- 
identified. This is similar to the case of RC 
Cola which is correctly identified 47 times and 
misidentified 58 times. In all three cases, 
these colas are misidentified almost as often as 
they are correctly identified. 

In Part II (where each of 20 Ss was given 
three samples of the same cola) the frequency
of correct responses varied from 17 to 23.

As to the percentage of correct responses 
when Ss were given different colas, for the Coca 
Cola samples, there were 54% correct identifi- 
cations, 43% for Pepsi Cola and 45% for RC 
Cola. As in the previous study, Coca Cola 
is in the lead as regards correct identifications. 
For all three brands 34% are correctly called 
and 66% incorrectly. 







A comparison of critical ratio tests of the 
hypothesis that the various identification re- 
sponses are not on the basis of actual taste 
stimuli was then made. The correct identifi- 
cations of the three respective Colas do not 
show the same results. Those for Pepsi Cola 
and RC Cola are low, actually 1.44 and 1.71 
respectively. However, the CR for the Coca 
Cola category is 3.19 which indicates a statis- 
tically significant difference between the ob- 
tained frequency and chance expectancy. 
Furthermore, analysis of data based on fre- 
quency of correct responses shows very low 
ratios ranging from 0 to 1.00, indicating that 
in no instance does any obtained frequency 
vary significantly from chance expectancy. 

Next the percentages of correct identifica- 
tions of Parts I and II were compared. In 
the former case three different Colas were 
given while in the latter three identical samples 
were administered. In the case of Pepsi Cola 
and RC Cola, the tests of a statistically signifi- 
cant difference between Parts I and II yield the 
low critical ratios of .76 and .89 respectively. 
However, the CR obtained for a difference 
between Coca Cola identified correctly in Parts
I and II is 3.44, a statistically significant differ-
ence which suggests a behavioral difference
between these parts. Apparently, as a group, 
our subjects do not discriminate in the same 
way in the two situations. There is a differ- 
ence in the number of correct discriminations 
when they are given three samples of the same 
Cola as compared with administration of three 
different Colas. 

Significance tests of a comparison of Study 
II where correct names were not given in the 
instructions to S and the present study in 
which they were indicated, show that when 
the frequency of correct identifications for the 
two studies is compared, no reliable differences 
are found. However, this does not mean that 
Coca Cola is not being identified with better 
than chance frequency as found in the present 
study. 

A similar comparison for Parts II of Study 
II and the present study yields CR’s for differ-
ences in Coca Cola, Pepsi Cola and RC Cola
correct identifications that are not reliably
different. Those for the incorrect identifica- 
tions show high variability. 

The overall results from this study suggest 
that when Ss are placed in the comparatively 
restricted situation of the present experiment, 
their identifications are no better than in Part 
II of the previous studies (i.e. when they are 
administered three identical samples of a cola 
beverage). Their identification responses are 
not statistically different from chance ex- 
pectancy. The same holds for identification 
of the Pepsi Cola and RC Cola drinks. The 
only shift concerns the Coca Cola beverage 
which is identified with greater than chance 
frequency in this situation where choice of 
correct name is experimentally limited to the 
three names given the subject in the instruc-
tions. Narrowing his choice apparently per-
mits him to make more strikes, although even
in this situation he misidentifies Coca Cola
almost as often as he identifies it.


Summary 


A total group of 165 Ss was asked to
identify one-oz. samples of the following three
Cola Beverages: Coca Cola, Pepsi Cola and
Royal Crown (RC) Cola. In Part I, 105 Ss
were presented one of each of three different
Colas while in Part II, 60 Ss were given three
samples of the same beverage being evenly
divided among the three different classes.
Instructions to both groups stated the beverages
to be the three named above.

In general, when Ss are given three samples
of the same beverage their identifications are
not significantly different than when the
beverages are unknown Colas or actually these Colas
but unspecified. The same holds for Pepsi
Cola and RC Cola identifications when three
different samples are given. The situation
with Coca Cola is different, for in this study
this beverage is identified with a frequency that
yields a statistically significant difference from
chance expectancy.


Received October 21, 1949.
Early publication.








Book Reviews 


Reynolds, Lloyd G., and Shister, Joseph. Job 
horizons. New York: Harper and Brothers, 
1949. Pp. x+102. $2.25. 


This is a descriptive report to lay readers 
issued by the Labor and Management Center 
of Yale University. It is concerned with 
causes of labor mobility or immobility as 
given to fifteen interviewers by 800 manual 
workers in a New England manufacturing city 
in 1947. 

The first two chapters cover the general 
problem and importance of mobility and the 
factors in job satisfaction. The factors are 
defined and discussed in an interesting manner 
and include numerous quotations from manual 
workers. The next two chapters relate to how 
workers go about finding new jobs, what causes 
them to pick one job rather than another, why 
they left school, and how they went about 
getting their first job. Few persons will fail
to profit from a careful reading of these chap-
ters, and they are of special value to vocational
counselors. Emphasis is placed on the non-
rational procedure typically followed by ap-
plicants, their lack of knowledge of the labor
market, and inappropriateness of current eco- 
nomic theory. The last two chapters cover 
movement up the occupational ladder (which 
is obviously limited because the study is re- 
stricted to those in manual occupations) and 
the worker’s view of job opportunity. This 
last chapter draws the other chapters together 
and points out that the worker’s behavior is 
nonrational or irrational from the viewpoint of 
economic theory rather than the circumstances 
as the worker sees them. 

The book is concluded with a brief ap- 
pendix on methods including sampling. Many 
readers will question the use of samples ran-
domly selected from manual workers listed in
city directories. As the authors have pointed 
out, many of the more mobile workers have 
been eliminated. 

As a research report, even for the general 
public, this book is deficient in several ways. 
Unwarranted generalizations and conclusions 
seem to have been made, and the book has no 
index or bibliography. Many of the findings
which appear revelationary to the economist-
authors will be considered commonplace by
psychologists. No reference is made to other 
work in the field of worker motivation, job 
satisfaction, or job preferences. Many of these 
other researches, apparently conducted with 
care equal to this one, conflict with some of the 
findings reported here. Neither such conflicts 
(nor the agreements) are mentioned. This is 
unfortunate for the lay as well as technically 
trained reader. The book frequently refers to 
“workers.” It is to be regretted that the more
restricted “manual workers” was not invari-
ably used to prevent the reader from over-
generalization.

Nevertheless, this small book is fascinating 
reading, and is packed with interesting facts 
and views. Few readers will finish the book 
without mental stimulation and new ideas. 


C. E. Jurgensen 
Minneapolis Gas Company. 


Tobias Wagner. Selective job placement. New 
York: National Conservation Bureau, As- 
sociation of Casualty and Surety Execu- 
tives, 1946. Pp. 151. 


The author of this little volume has com- 
piled a number of tables and charts from the 
comparative study of the work efficiency of a 
group of normal workers and a group of 
workers with various types of disabilities. 
These comparisons are presented in the form of
averages, e.g., average number of days absent 
per year, average rate of production, average 
rated quality of production, etc. Much of the 
potential value of the book as a source of infor- 
mation on the industrial usefulness of the dis- 
abled is lost because the author has failed to 
present two very important types of statistical 
information. At no point in the book does 
the author state the number of cases of either 
normal or disabled workers concerned in this 
research. Despite the bristling appearance of
the book in terms of the statistical data pre-
sented, no indication of the reliability of the
averages nor any test of the significance of
differences between the averages is ever presented. These
are serious shortcomings in any research work,
and the omission is difficult to understand. 

In general, the approach to the topic of 
worker placement, whether disabled or non- 
disabled, is rather elementary. If Dr. Wagner 
was writing this book for the lay person, his 
looseness of description and generality of state- 
ment might be forgiven, but he speaks of 
directing this book toward “‘personnel man- 
agers, safety engineers, rehabilitation special- 
ists, and employers in general.” 

The general impression given by the author 
is that he is selling the merits of disabled 
workers to the industrial employer. However, 
it does appear that the disabled are at least as 
efficient in industrial jobs where their disability 
is not a handicap, as the normals in the 
same jobs. 

The book might well be used in industrial 
concerns where attempts are being made to 
get management to accept disabled people for 
employment. It could be used as an argument 
for the disabled. It does not, however, seem 
to have any place in the college curriculum as a 
text, since the contribution of the book is 
rather limited. 

The somewhat repetitious style of the
author makes continued reading a little
difficult, as the same statements or ideas are
presented at the beginning and the end of
consecutive chapters. Such terms as always,
every, entirely, absolutely, impossible,
constantly, absolutely necessary, all, and
completely are often used ill-advisedly.

The book has merit in bringing together 
in one volume the research conducted by Dr. 
Wagner, and despite its brevity, it covers a 
great deal of ground. The section on the types 
of physical disabilities reads like a medical 
dictionary, and covers the orthopedic, visual, 
and hearing disabilities. Despite the inade- 
quacy of the reported statistics, this book 
would serve as a useful reference on this 
specific topic. 

A. A. Canfield 

University of Southern California 


Flesch, Rudolf. The art of readable writing.
New York: Harper and Brothers, 1949.
Pp. 237. $3.00.


“To come right out with it,” writes the
author, “this is a book on rhetoric. Its
purpose is to help you in writing.” With this
introduction Dr. Flesch attempts to present a
“modern, scientific rhetoric” for informal,
useful, every-day writing. While this reviewer
would be presumptuous to evaluate the rhetoric
per se, it is his opinion that this book is
an excellent contribution to more effective
communication.

The Art of Readable Writing takes a wider 
view of the writing process than its predecessor, 
The Art of Plain Talk. While the first book 
was primarily concerned with the readability of 
sentences and words, in the second book the 
whole technique of writing readably is surveyed. 

In chronological sequence Flesch takes up 
the problems which a writer faces (or should 
face) in writing. In clear short chapters, 
packed with well-chosen examples to illustrate 
his meanings, he follows the writing effort from 
the initial idea to the final draft. How-to- 
start is dealt with by concrete suggestions for 
audience appraisal, collecting facts, assimilating
data and getting a “slant” or theme.
What-to-say is covered by recommendations 
for meaningful introductions, narrative style 
and personalizing the message. How-to-say- 
it is given with instructions on writing col- 
loquially, shaking off some misconceptions 
about form and content, and achieving the 
correct level of difficulty and interest. 

In addition Flesch warns his readers about
misunderstandings arising from ambiguous
words and reader errors. He includes an
especially timely caution concerning the use
of diagrams and pictures to convey information
without adequate written explanation.

An appendix explains the use of the revised 
readability yardsticks essentially as given in 
J. appl. Psychol., 1948, 32, 221-233. Nomo- 
graphs for computing reading ease and human 
interest scores are printed on the end papers 
of the book. A good, annotated reading list 
for more specific help in writing and a complete 
set of notes on the illustrative and scientific 
references for each chapter round out the book. 
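
As a rough illustration of what the yardsticks and nomographs compute, the two scores can be sketched in a few lines of Python. The constants below are taken from the commonly published form of the 1948 Flesch formulas; they are an assumption here, since this review does not reproduce them, and the function and parameter names are hypothetical. The counts (words, sentences, syllables, personal words, personal sentences) are assumed to have been tallied by hand from a sample of text.

```python
def reading_ease(words, sentences, syllables):
    """Flesch Reading Ease (assumed 1948 form) from raw counts of a text sample."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    sl = words / sentences          # average sentence length in words
    wl = 100.0 * syllables / words  # syllables per 100 words
    return 206.835 - 0.846 * wl - 1.015 * sl


def human_interest(words, sentences, personal_words, personal_sentences):
    """Flesch Human Interest (assumed 1948 form) from raw counts of a text sample."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    pw = 100.0 * personal_words / words          # personal words per 100 words
    ps = 100.0 * personal_sentences / sentences  # personal sentences per 100 sentences
    return 3.635 * pw + 0.314 * ps
```

A call such as reading_ease(100, 5, 150) scores a 100-word sample of five sentences and 150 syllables; higher values indicate easier reading on the first scale and more personal material on the second.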

The presentation is skillful, readable and 
entertaining throughout. While the work may 
not be as “scientific” as Flesch implies it is, it 
is certainly an excellent combination of current 
knowledge with the shrewd insight and “know- 
how” of a readability expert who can really 
write. 




This book is one which should be read and 
kept on the desk of everyone who writes to be 
understood. It has implications for all areas 
of writing: letters, news stories, textbooks, 
handbooks, advertisements or union contracts. 
It is a book of high interest and real importance 
to communicating people whether they be in 
academic or applied fields. 


James J. Jenkins 
University of Minnesota 


Williamson, E. G. (Editor). Trends in student 
personnel work. Minneapolis: The Uni- 
versity of Minnesota Press, 1949. Pp. x+ 
417. $5.00. 


Too often the papers presented at a con- 
ference commemorating some individual or in- 
stitution sound like the minutes of a mutual 
admiration society when later published. This 
volume of forty-three papers by forty authors, 
although originally given in November, 1947 
at the University of Minnesota to celebrate 25 
years of personnel work there, is a notable ex- 
ception for several reasons: (1) as the title 
indicates, the authors were as much concerned 
with the future steps to be taken as with the 
accomplishments of the past; (2) the breadth of 
the Minnesota program as reviewed insured a 
comprehensive treatment of the total field; 
and (3) the long period of leadership in the 
field made the event of general as well as local 
interest. 

Following an appropriate introduction, the 
papers deal successively with the development 
of student personnel work in all its major 
aspects. Part I. The Role of Personnel Work 
sets the tone of the conference by showing, in 
articles by Willey, Cowley, MacLean, and Mc- 
Clusky, how the personnel point-of-view arose 
and its necessity for constructing a sound 
educational program for American youth. 
Following a survey (Part II) of aptitude and 
interest and personality testing by Stuit and 
Darley, respectively, in Part III, Paterson 
demonstrates vocational counseling utilizing 
the best of available case-study and measure- 
ment techniques and Shartle discusses areas of 
research necessary to understand occupational 
adjustment. 

Part IV, by Bordin and Porter, helps the 
reader to understand the directive versus non-directive
controversy in the light of the basic
issues and long-term trends involved. Subse- 
quent parts deal with the more specialized 
areas of student personnel work including 
mental hygiene, problems of veterans, women 
students, and foreign students, marriage coun- 
seling, discipline, speech, financial problems, 
medical services, and housing. The problems of 
selection, training and utilizing faculty coun- 
selors are given extensive treatment as is 
religious counseling, as viewed by representa- 
tives of the major faiths. In an integrated 
series of papers, Lloyd-Jones, Wrenn, and 
Darley trace the history of personnel work as 
a profession and present a realistic appraisal 
of its present stage of development. Although 
emphasized throughout, the importance of the 
social and cultural influences of the campus 
upon the student’s total growth is pointed up 
in papers by Cowley and Sutherland. In the 
concluding Part XIV, Turner and Tyler discuss 
recent developments in achievement testing 
and their effect upon college admission policies 
and curriculum construction. References at 
the end of each paper and a detailed index of 
the volume increase its value as a source-book. 

This publication should have a wide reader- 
ship. Administrators will find it a powerful 
stimulus to the development of student per- 
sonnel programs and faculty will be given an 
understanding of the rationale and functions 
of student personnel work. Personnel workers 
will find many suggestions for improving their 
own programs and personal effectiveness and 
students training for personnel work will find 
it a useful reference book and source of ideas 
for significant research. Although not de- 
signed as a textbook in the ordinary sense, it 
can readily serve as the basic source-book for 
a graduate seminar in student personnel 
administration. 

Albert S. Thompson 
Teachers College, 
Columbia University 


Brouwer, Paul J. Student personnel services in
general education. Washington, D. C.:
American Council on Education, 1949.
Pp. 317. $3.50.


Student Personnel Services in General Education
covers the period from 1939 to 1944. It
includes a description of participation by an
original group of 22 colleges and universities
minus 7 drop-outs, plus three late additions.
It is a bewildering book. Basically, it is a 
report of the findings about student personnel 
services offered by an extremely heterogeneous 
group of colleges sampled to include “the land- 
grant college, the municipal university, the 
state teachers college, the independent liberal 
arts college, the Catholic college, the Protes- 
tant church-related college, the Negro college, 
the four-year college for women, the junior 
college for women, and coeducational junior 
college.” Some of its best parts are those 
which reflect the thinking of a particular 
author, rather than the agreements of the 
group which participated in the program. 

The heterogeneity of participating institu- 
tions, plus individual viewpoints in the writing 
of various sections, contribute to unevenness 
in writing, and some lack of coherence. Un- 
fortunately, there is neither a subject-matter 
nor author index. These omissions offer the 
usual handicap which drives the interested but 
baffled reader to frenzy. 

The reviewer found two sections of particular
interest. These are Section 1, The counseling
service (pp. 7-46) and Section 5, Pre- and
post-college personnel services (pp. 102-114).
The first has evidently been written quite recently,
since it contains some excellent writing covering
the extremes of the continuum of counseling
methodology. Some of the materials
would not have been available if it had been
written at the time the program was in progress.
The author almost makes a distinction
between counseling and advising. Had this
distinction been clearly made, it would have
done much to clarify confusion in this regard
in the minds of many college administrators
and college instructors. The term counseling
is encountered as a process carried on by students.
Those readers who struggled through
graduate courses and degrees to reach the first
rung on the professional counseling ladder
must be excused for the slight shudder.

Section 5, Pre- and post-college personnel 
services, is weak in contrast to Section 1. A 
real question is whether the legitimate objective
of friendliness and rapport is not sufficient
to be gained from these methods without
making the questionable claim that we know 
the student sufficiently well through casual 
interview and unverified information to help 
him materially in his planning. 

Part III, The principles of personnel services 
(Sister Annette) is, in the opinion of the re- 
viewer, the heart of the book for those who are 
interested in counseling as the foundation of 
student personnel work. Sister Annette’s recognition
of differences in the basic individual
personality structures of teachers and their effect
on what they can do as personnel workers is
refreshing (p. 242).

General criticisms of the book include failure
to give adequate attention to biological
heredity, to special aptitudes and abilities, to 
mental organization as a basis for curriculum 
building, to trait variability, and to multiple 
curricula. Despite these shortcomings, parts 
of the book are valuable to an extent that 
causes the reviewer to recommend it for pur- 
chase. It is hoped that the policies of the 
American Council on Education will encourage 
more specificity in its publications treating 
student personnel work than has been the case 
in this publication. If some of the topics in 
this book had been treated in terms of insti- 
tutional enrollment differentials and privately 
versus publicly supported colleges, the publica- 
tion might have been far more valuable to a 
larger group of readers. And, finally, can’t 
we have a law which requires an index for all 
books? 


Milton E. Hahn 


University of California, Los Angeles 










New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to Donald G. Paterson, Editor, 
Department of Psychology, University of Minnesota, Minneapolis 14, Minnesota 


Pavlov, a biography. B.P. Babkin. Chicago: 
University of Chicago Press, 1949. Pp. 
470. $6.00. 

Interaction process analysis. Robert F. Bales. 
Cambridge: Addison-Wesley Press, Inc., 
1949. Pp. 224. $6.00. 

Workbook in personnel methods. Robert M. 
Bellows and Carl H. Rush, Jr. Dubuque: 
Wm. C. Brown Co., 1949. Pp. 102. $2.10. 

Conference methods in industry. Henry M. 
Busch. New York: Harper and Brothers, 
1949. Pp. 107. 

Applied experimental psychology. Alphonse 
Chapanis, Wendell R. Garner, and Clifford 
T. Morgan. New York: John Wiley and 
Sons, Inc., 1949. Pp. 434. $4.50. 

In the name of common sense. Revised edition. 
Matthew N. Chappell. New York: The 
Macmillan Co., 1949. Pp. 172. $2.75. 

Essentials of psychological testing. Lee J. Cron- 
bach. New York: Harper and Brothers, 
1949. Pp. 475. $4.50. 

Current trends in industrial psychology. Wayne 
Dennis et al. Pittsburgh: University of 
Pittsburgh Press, 1949. Pp. 198. $3.75. 

Courts on trial. Jerome Frank. Princeton: 
Princeton University Press, 1949. Pp. 441. 
$5.00. 

Psychoanalysis. Second edition. Edward 
Glover. New York: Staples Press, Inc., 
1949. Pp. 367. $4.00. 

The prediction of categories from measurements: 
with applications to personnel selection and 
clinical prognosis. J. P. Guilford and 
William B. Michael. Beverly Hills: Sheri- 
dan Supply Co., 1949. Pp. 55. $1.40. 

Organization of behavior. D. O. Hebb. New 
York: John Wiley and Sons, Inc., 1949. 
Pp. 335. $4.00. 

Trends in industrial relations. Bulletin No. 16. 
Alexander R. Heron et al. Pasadena: 
Industrial Relations Section, California In- 
stitute of Technology, 1949. Pp. 88. 
$1.00. 

Bristow Rogers: American Negro. Else P. 
Hillpern, Irving A. Spaulding, and Edmund
P. Hillpern. New York: Hermitage House,
Inc., 1949. Pp. 184. $3.00. 

Leadership and isolation. Helen Hall Jennings. 
New York: Longmans, Green, and Co., Inc., 
1949. Pp. 240. $3.00. 

Therapeutic group work with children. Gisela 
Konopka. Minneapolis: University of 
Minnesota Press, 1949. Pp. 134. $2.50. 

The magic cloak. James Clark Moloney. 
Wakefield, Mass.: The Montrose Press, 
1949. Pp. 345. $5.00. 

Introduction to psychopathology. Lawrence I. 
O’Kelly. New York: Prentice-Hall, Inc., 
1949. Pp. 736. $4.50. 

Industrial and occupational trends in national 
employment. 1910-1940, 1910-1948. Re- 
search Report No. 11. Gladys L. Palmer 
and Ann Ratner. Philadelphia: Industrial
Research Department, University of
Pennsylvania, 1949. Pp. 80. $1.00. 

Mass communications. Wilbur Schramm, Ed- 
itor. Urbana: University of Illinois Press, 
1949. Pp. 552. $4.50. 

Varieties of delinquent youth. William H. Shel- 
don. New York: Harper and Brothers, 
1949. Pp. 899. $8.00. 

Predicting success in professional schools. 
Dewey B. Stuit, Gwendolen S. Dickson, 
Thomas P. Jordan, and Lester Schloerb. 
Washington, D. C.: American Council on 
Education, 1949. Pp. 187. $3.00. 

Foundations of method for secondary schools. I. 
N. Thut and J. Raymond Gerberich. New 
York: McGraw-Hill Book Co., Inc., 1949. 
Pp. 493. $4.00. 

Personality maladjustments and mental hygiene. 
Second Edition. J. E. Wallace Wallin. 
New York: McGraw-Hill Book Co., Inc., 
1949. Pp. 581. $5.00. 

Hypnotherapy of war neuroses. John G. Wat- 
kins. New York: The Ronald Press Co., 
1949. Pp. 384. $5.00. 

Sight, light and efficiency. H. C. Weston. 
London: H. K. Lewis and Co., Ltd., 1949. 
Pp. 318. 42s. net. 







Theory of hearing. Ernest Glen Wever. New
York: John Wiley and Sons, Inc., 1949.
Pp. 484. $6.00.

Children’s voluntary reading as an expression of
individuality. Mary Hayden Bowen Wollner.
New York: Bureau of Publications,
Teachers College, Columbia University,
1949. Pp. 117. $2.35.

Identifying and developing potential leaders.
Personnel Series No. 127. New York:
American Management Association, 1949.
Pp. 39. $.75.

The Harvard list of books in psychology. Cambridge:
Harvard University Press, 1949.
Pp. 77. $1.00.














Subscription Lists of the 
American Psychological Association 


MEMBERS AND AFFILIATES 
Approximately 9,300 names 


The American Psychological Association main- 
tains an address list of its members and affiliates, 
which is for sale providing the nature of its use is in 
conformity with the purposes of the Association. 


1950 Prices 
Envelopes addressed $35.00 
(advertiser furnishes envelopes and pays express charges) 


Addresses on tape, not gummed 
(suitable for a mailing machine) 


STATE LISTS 
Priced according to number of names wanted 


SUPPLEMENTARY LISTS 


Approximately 3,500 names in total list 
Individual journal lists vary from 400 to 1,600 


The Association also maintains a list of subscribers 
who are not members of the Association (universi- 
ties, libraries, industrial laboratories, hospitals, other 
types of institutions, and individual subscribers). 
The general list for all journals includes all types. 
Each single journal has a more specialized circulation. 


For any one journal, envelopes addressed .... $15.00 
For any one journal, addresses on tape 

For all journals, envelopes addressed 

For all journals, addresses on tape 


For further information, write to 


American Psychological Association 
1515 Massachusetts Avenue Northwest 
Washington 5, D. C. 

















Future titles in the series of 
PSYCHOLOGICAL MONOGRAPHS: GENERAL AND APPLIED 
Volume 63, 1949 


FACIAL EXPRESSIONS OF EMOTION. James C. Coleman, Uni- 
versity of Southern California. #296, $1.00 


A COMPARATIVE STUDY OF THE WHERRY-DOOLITTLE AND 
A MULTIPLE CUTTING-SCORE METHOD. Glen Grimsley, 
General Motors Institute. #297, $.75 


FACTOR ANALYSES OF TESTS AND CRITERIA: A COMPARATIVE 
STUDY OF TWO AAF PILOT POPULATIONS. William B. 
Michael, Princeton University. #298, $1.00 


THE APPRAISAL OF PARENT BEHAVIOR. Alfred L. Baldwin, 
Joan Kalhorn, and Fay Huffman Breese, Fels Research Insti- 
tute. #299, $1.50 


STUDIES OF IDENTICAL TWINS REARED APART. The late 
Barbara S. Burks; and Anne Roe, New York City. #300, $1.00 


COLOR PREFERENCES OF PSYCHIATRIC GROUPS. Samuel J. 
Warner, New York City. #301, $.75 


PERCEPTION OF BODY POSITION AND OF THE POSITION OF THE 
VISUAL FIELD. H. A. Witkin, Brooklyn College. #302, $1.00 


AN EXPERIMENTAL EXAMINATION OF THE THEMATIC APPER- 
CEPTION TECHNIQUE IN CLINICAL DIAGNOSIS. A. A. 
Hartman, Boston University. #303, $1.00 


RELIGION AND HUMANITARIANISM: A STUDY OF INSTITU- 
TIONAL IMPLICATIONS. Clifford Kirkpatrick, Indiana Uni- 
versity. #304, $.75 


THE DEVELOPMENT AND VALIDATION OF A SET OF MUSICAL
ABILITY TESTS. Robert W. Lundin, Hamilton College.
#305, $1.00


A COMPARISON OF TWO TESTS OF INTELLIGENCE ADMINISTERED
TO ADULTS. Anna S. Elonen, University of Chicago.
#306, $1.00


The 1949 volume of the Psychological Monographs will comprise 
eleven separate issues. We will be glad to place orders for any of 
these; the orders will be filled when the issues appear. The entire 
volume may be subscribed to for $6.00. The actual date of publica- 
tion is uncertain. The 1949 volume will probably be completed by 
April of 1950. 


American Psychological Association 


1515 Massachusetts Avenue N. W., Washington 5, D. C. 














1949 DIRECTORY 


AMERICAN 


PSYCHOLOGICAL ASSOCIATION 


1515 MASSACHUSETTS AVENUE N. W. 
WASHINGTON 5, D. C. 


In the alphabetical list of 6735 members, the 1949 Directory of the 
Association gives the names of the members, their addresses, their 
present positions, their last degrees, and their class of membership. 
Membership lists for the Divisions of the Association, the lists of 
Diplomates in the fields of clinical, industrial, and counseling of the 
American Board of Examiners in Professional Psychology, the 
By-Laws, and a geographical and institutional index of members 
are included. The editor is Helen M. Wolfle of the Association 
staff. 250 pages, $2.00. 





SAMPLE ENTRIES 





Humphreys, Lloyd G. School of
Education, Stanford Univ, Stanford,
Calif. Assoc. prof. educ. and
psych. PhD 38. A 5; F 9, 19. 


Hunsicker, Mr. Albert L. Com- 
mittee on Human Development, 
Univ. of Chicago, Chicago 37, Ill. 
Stud. MA 39. A. 


Hunt, Dr. Howard F. Dept. Psych, 
Univ. of Chicago 37, Ill. Assoc. 
prof. PhD 43. F 12. 


Hunt, Dr. J. McV. Institute of 
Welfare Research, Community 
Service Society, 105 East 22nd St,
New York 10, N. Y. Dir. PhD
33. Dipl-Cl. F 8, 8, 9, 12.


Hunt, Mary Louise 1252 Talbert 
St. S. E, Washington, D. C. A '48. 


Hunt, Dr. Thelma Dept. Psych, 
George Washington Univ, Washington
6, D. C. F 5, 12.

Hunt, Prof. William A. Dept. 
Psych, Northwestern Univ, Evanston,
Ill. Prof. psych. and biol.
PhD 31. Dipl-Cl. F 2, 3, 8, 9, 18,
19, 20. 


Hunt, Mr. Wilson L. Boston 
State Hosp, 591 Morton St, Dor- 
chester Center 24, Mass. Clin. 
psych’t. AM 47. A '49. 


Hunter, Dr. Elwood C. Dept. Edu- 
cation, Tulane Univ, New Orleans 
15, La. Head of dept. PhD 35. 
A 5, 15. 
















